IMMUNOGENIC COMPOSITIONS FOR STREPTOCOCCUS PYOGENES
All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the fields of immunology and vaccinology. In particular, it relates to antigens
derived from Streptococcus pyogenes and their use in immunisation.

BACKGROUND ART 

Group A streptococcus ("GAS", S. pyogenes) is a frequent human pathogen, estimated to be
present in between 5 and 15% of normal individuals without signs of disease. When host defences
are compromised, or when the organism is able to exert its virulence, or when it is introduced to
vulnerable tissues or hosts, however, an acute infection occurs. Related diseases include
puerperal fever, scarlet fever, erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis and
streptococcal toxic shock syndrome.

Although S. pyogenes may be treated using antibiotics, a prophylactic vaccine to prevent the onset
of disease is desired. Efforts to develop such a vaccine have been ongoing for many decades.
While various GAS vaccine approaches have been suggested and some approaches are currently in
clinical trials, to date, there are no GAS vaccines available to the public.

It is an object of the invention to provide further and improved compositions for providing immunity 
against GAS disease and/or infection. The compositions are based on a combination of two or more 
(e.g. three or more) GAS antigens.

DISCLOSURE OF THE INVENTION

Applicants have discovered a group of thirty GAS antigens that are particularly suitable for 
immunisation purposes, particularly when used in combinations. The invention therefore provides an 
immunogenic composition comprising a combination of GAS antigens, said combination consisting 
of two to thirty-one GAS antigens of a first antigen group, said first antigen group consisting of: GAS
117, GAS 130, GAS 277, GAS 236, GAS 40, GAS 389, GAS 504, GAS 509, GAS 366, GAS 159,
GAS 217, GAS 309, GAS 372, GAS 039, GAS 042, GAS 058, GAS 290, GAS 511, GAS 533, GAS
527, GAS 294, GAS 253, GAS 529, GAS 045, GAS 095, GAS 193, GAS 137, GAS 084, GAS 384,
GAS 202, and GAS 057. These antigens are referred to herein as the 'first antigen group'.
Preferably, the combination of GAS antigens consists of three, four, five, six, seven, eight, nine, or ten
GAS antigens selected from the first antigen group. Preferably, the combination of GAS antigens
consists of three, four, or five GAS antigens selected from the first antigen group.

GAS 40 and GAS 117 are particularly preferred GAS antigens. Preferably, the combination of GAS
antigens includes either or both of GAS 40 and GAS 117. Representative examples of some of these 
antigen combinations are discussed below. 
The combination of GAS antigens may consist of three GAS antigens selected from the first antigen
group. Accordingly, in one embodiment, the combination of GAS antigens consists of GAS 40, GAS
117 and a third GAS antigen selected from the first antigen group. In another embodiment, the
combination of GAS antigens consists of GAS 40 and two additional GAS antigens selected from the
first antigen group. In another embodiment, the combination of GAS antigens consists of GAS 117
and two additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of four GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and two
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and three additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and three
additional antigens selected from the first antigen group.

The combination of GAS antigens may consist of five GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and three
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and four additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and four
additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of eight GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and six
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and seven additional GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117 and seven
additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of ten GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117 and eight
additional GAS antigens selected from the first antigen group. In one embodiment, the combination
of GAS antigens consists of GAS 40 and nine additional GAS antigens selected from the first antigen
group. In one embodiment, the combination of GAS antigens consists of GAS 117 and nine
additional GAS antigens selected from the first antigen group.
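
By way of illustration only, the number of distinct combinations covered by the embodiments above can be counted with a short Python sketch. The sketch assumes that the first antigen group has the thirty-one members enumerated above; the function name count_combinations and the dictionary keys are hypothetical and are not part of the disclosure.

```python
from math import comb

# Assumption: the first antigen group has the 31 members enumerated in the text.
GROUP_SIZE = 31


def count_combinations(k: int) -> dict:
    """Count k-antigen combinations drawn from the first antigen group.

    Returns the total number of combinations, the number that contain both
    GAS 40 and GAS 117, the number that contain GAS 40 (with or without
    GAS 117), and the number that contain at least one of the two.
    """
    if not 2 <= k <= GROUP_SIZE:
        raise ValueError("k must be between 2 and GROUP_SIZE")
    total = comb(GROUP_SIZE, k)
    # Fix both preferred antigens, choose the remaining k - 2 from the other 29.
    with_both = comb(GROUP_SIZE - 2, k - 2)
    # Fix GAS 40, choose the remaining k - 1 from the other 30.
    with_gas40 = comb(GROUP_SIZE - 1, k - 1)
    # Subtract the combinations that contain neither preferred antigen.
    with_either = total - comb(GROUP_SIZE - 2, k)
    return {
        "total": total,
        "with_both": with_both,
        "with_gas40": with_gas40,
        "with_either": with_either,
    }


if __name__ == "__main__":
    for size in (3, 4, 5, 8, 10):
        print(size, count_combinations(size))
```

By symmetry, the count of combinations containing GAS 117 equals the count containing GAS 40.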

Each of the GAS antigens of the first antigen group is described in more detail below. Genomic
sequences of at least three GAS strains are publicly available. The genomic sequence of an M1 GAS
strain is reported at Ref. 1. The genomic sequence of an M3 GAS strain is reported at Ref. 2. The
genomic sequence of an M18 GAS strain is reported at Ref. 3. Preferably, the GAS antigens of the
invention comprise a polynucleotide or amino acid sequence of an M1, M3 or M18 GAS strain. More
preferably, the GAS antigens of the invention comprise a polynucleotide or amino acid sequence of an
M1 strain.
(1) GAS 117

GAS 117 corresponds to M1 GenBank accession numbers GI: 13621679 and GI: 15674571, to M3
GenBank accession number GI: 21909852, to M18 GenBank accession number GI: 19745578, and is
also referred to as 'Spy0448' (M1), 'SpyM3_0316' (M3), and 4 SpyM18.049r (M18). Examples of
amino acid and polynucleotide sequences of GAS 117 of an M1 strain are set forth below:

SEQ ID NO: 1

MTLKKHYYLLSlJJdjVTVGAAFOT 

LGRHYSSVYYYNLRTVMGLSSEQDIBKHYBBLKNKLHDMYNHY 

SEQ ID NO: 2 

ATGACACTAAAAAAACACTATTATCTTCTCAGCCTGCTAGCT 
CAAGCCAGAGTGTCAGTGCACAAGTTTATAGCAATGAAGGGTATCACCAGCA^ 
ACACCTGCAATATAGTAAAGACAACX5CACAACTTCAATTGAGAAATATCCTTGACGG 
CTAGGGAGACACTACTCTAGCTATTATTACTACAACCTAAGAACCGT^ 
ACATTGAAAAACACTATGAAGAGCTTAAGAACAAGTTACATGATATGTACAATCATTAn 

Preferred GAS 117 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 1; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 1, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 117 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 1. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 1. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 1. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 1
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
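
The identity criterion (a) and the fragment criterion (b) can be illustrated with the following minimal Python sketch. It is a sketch under stated assumptions, not a definition of how identity is determined for the purposes of the invention: the two sequences are assumed to have been aligned beforehand, with '-' marking gaps, identity is taken as the share of aligned columns holding identical residues, and the function names and the short toy sequence are hypothetical (the toy sequence is not SEQ ID NO: 1).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns,
    excluding columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


if __name__ == "__main__":
    # Hypothetical toy reference sequence, for illustration only.
    reference = "MKTAYIAKQRQISFVK"
    print(percent_identity("MKTAYIAK", "MKTAYLAK"))  # one mismatch in eight columns
    print(is_fragment("AYIAKQR", reference))         # seven consecutive residues
    print(is_fragment("AYIAKQ", reference))          # only six residues
```

In practice the alignment itself would be produced by a sequence alignment tool, and the resulting identity depends on the alignment parameters chosen.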

(2) GAS 130 

GAS 130 corresponds to M1 GenBank accession numbers GI: 13621794 and GI: 15674677, to M3
GenBank accession number GI: 21909954, to M18 GenBank accession number GI: 19745704, and is
also referred to as 'Spy0591' (M1), 'SpyM3 JM18* (M3), and 'SpyM18_0660' (M18). GAS 130 has
been identified as a putative protease. Examples of amino acid and polynucleotide
sequences of GAS 130 of an M1 strain are set forth below:

SEQ ID NO: 3 

MSHMKKR PEVLS PAGTLBKLKVAI DYGADAVFVGGQAYGLRS RAGNFSMEELQEG I DYAHARGAKVYVAA 
NMVTOEGNEIGAGEWFRQLRDMGLDAVIVSDPALI 

R WLAREVNMAELAE I RKRTDVB I EAFVHGAMCI S YSGRCVLSNHMSHRDANRGGCSQSCRWKYDLYDMP 
FGGERRSLKGEIPEDYSM5SVDMCMIDHIPDLIENGVDSLKIEGRMKSIHYVSTVTNCYKAAVGAYMESP 
EAFYAI KEELI DBLWKVAQRELATGFYYGI PTENEQLFGARRKI PQY KFVGEWAFDS ASMTAT I RQRNV 
IMEGDRI ECYGPGFRHFBTWKDLHDADGQKI DRAPN PMELLTI S LPREVKPGDMIRACKEGLVNLYQKD 
GTSKTVRT 

SEQ ID NO: 4 

ATGTCACATATGAAAAAACGTCCCGAGGTCITATCAC
TTGACTATGGCGCAGATGCTGTTTTTGTTGGAGGGC 

CTCTATOJAAGAATTGCAA^

AACATGGTTACCCACGAAGGC^CGAAATTGGTGCGGGCOAGTGGm 

TTGATCCGGTCATTGTTTCAGATCCAGCCTTGATTCTT 

TCATTTGTCAACGCAAGCTTCATCTACCAATTACGAGACCT^ 
CGAGTTGTTTTAGCTCGCGAGGTTAATATGGCCGAGTTAGCA

TTGAAGCCTTTGTCCATGGAGCCATGTGTATCTCTTACT 

TCACC0TGATWCAACAGGGGCXX3CTGCTCACAGTCTO 

TTrGGAGGAGAGCGCCGCTCCTTAAAAGGGGAAATTCCAGAAGACTATTCTAT^ 

GTATGATTGftCCATATTCCTGACCTGAT^^ 
ATCT ATCC ACTAeGTCTCAACCGTAACCAACTGTTACAM

GAAGCTTTTTATGCTATCAAAGAGGAATTGATTGACGAGTTC 

CAGGTTTTTACTATGGTATCCCAACrGAAAATCAAC^ 

TAAATTTGTCGGAGAAOTAGTTGCCTTT<^CTCAGCT 

ATCATGGAAGGCGATCGGATTGAATGTTATGGAC CAGGTTTCCGTCATTTTGAAACGGTTGTTAAGGACT 
TACATGATGCGGATGGCCAAAAGATTGACCGTGCCCCAAATCCAATGGMCT

GAGAGAAGTTAAGCCAGGGGATATGATTAGGGCTTGCAAGGAAGGTCTGGTTAACCTCT 
GGCACCAGTAAAACTGTTAGAACATAG 

Preferred GAS 130 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 3; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 3, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 130 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 3. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 3. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 3. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of
a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
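
The terminal truncations described above can be sketched in Python as follows. This is an illustrative sketch only; the function name truncate_termini and the short toy sequence are hypothetical (the toy sequence is not SEQ ID NO: 3), and the check that at least one residue remains is an assumption added to keep the example well defined.

```python
def truncate_termini(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return the sequence lacking n_term residues from the N-terminus
    and c_term residues from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    toy = "MKTAYIAKQRQISFVK"
    print(truncate_termini(toy, n_term=3))            # lacks 3 N-terminal residues
    print(truncate_termini(toy, c_term=2))            # lacks 2 C-terminal residues
    print(truncate_termini(toy, n_term=3, c_term=2))  # lacks residues from both ends
```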



(3) GAS 277

GAS 277 corresponds to M1 GenBank accession numbers GI: 13622962 and GI: 15675742, to M3
GenBank accession number GI: 21911206, to M18 GenBank accession number GI: 19746852, and is
also referred to as 'Spy1939' (M1), *SpyM3 J670' (M3), and 'SpyMlS^OO^ (M18). Amino acid
and polynucleotide sequences of GAS 277 of an M1 strain are set forth below:

SEQ ID NO: 5

M TTOQKTI SLLSLALLIGLLGTSGKAI S VYAQ DQHTDNVI ABST^ 

VRQPTQATITLKDASDNTINSWVYTMAAQQRR 

QNKARKTPTNMQQKDTSKAMTNSVDVDTKAQTOQ 

ASNSQKNGSNKTKMLVDKBBVKPTSKRGFPWVLLGLWSLAAGLPIAIQKVSRRK 

SEQ ID NO: 6 

ATGACAACTATGCAAAAAACAATTAGCTTATTATCACTAGCTTTA^ 

GCAAAGCCATATCTGTGTATGCACAAGATCAGCACACTGAT 

GGTCAGTGTTGAAGCCAGTATGCGTGGAACAGAACCTTATATTG 
GTCAGACAACCAACTCAGGCAACGATAACACTTAAAGACX3CTAGTGATAATACTATTAAT

ATACTATGGCAGCGCAACAGCGTCGTTTTACAGCTTGGT^ 

TCATGTAACTGTCACCGTTCATACTCAAGAAAAGGCAGTAACTG 

CAAAACAAAGCTAGAAAAACACCAACTAATATGCAACAAAAGGATACTTCTAAA 

TCGATGTAGACACAAAAGCTCAAACAAATCAATCAGCTAACCAAGAAATAGATTCTACTTCAAATC 
CAGATCAGCTACTAATCATCGATCAACTTCCTTAAAGCGATCT^

GCTAGTAATAGCCAAAAAAACX^TAGCAACAAGACAAAAATGCTAGTGGACAAA 
CTTCAAAAAGAGGATTCCCTTGGGTCTTATTAGGTCTAGT 
TATTCAAAAAGTATCTAGACGAAAATAA 

Preferred GAS 277 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 5; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 5, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 277 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 5. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 5. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 5. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 5
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(4) GAS 236 

GAS 236 corresponds to M1 GenBank accession numbers GI: 13622264 and GI: 15675106, to M3
GenBank accession number GI: 21910321, to M18 GenBank accession number GI: 19746075,
and is also referred to as 'Spy1126' (M1), 'SpyM3_0785' (M3), and 4 SpyM18 J087' (M18). Amino
acid and polynucleotide sequences of GAS 236 from an M1 strain are set forth below:

SEQ ID NO: 7 

MTQMNYTGKVKRVAI I ANGKYQSKRVASKLFSVFKDDPDFVLSKKNPDIVI SIGGDGMLLSAFHMYBKEL 
DKVRFVGIHTGHLGFYTDYRDFBVDKLIDNLRKDKGBQ 

KTMVADVI INHVKFESFRGDGISVSTPTGSTAYNKSLGGAVLHPTI BALQLTEI SSLNNRVFRTLGSSI I 
IPKKDKIELVPKRLGIYTISIDNKTCQUCN^ 

SEQ ID NO: 8 

ATGACACAGATGAATTATACAGGTAAGGT 

AAaCGTCGCCTCCAAACTTTTCTCCGTATTTAAAGATGATCCTGATTT 
GGATATTGTGATTTCTATTGGCGGAGATGGGATGCTCTTA
GATAAGGTACGTTTTGTAGGAATCCACACCGGTCATCTTGG 
TTGATAAATTAATTGATAATTTAAGAAAAGACAAGGGAGAACAMTCTCTTATCCG 
TATTACTTTAGATGATGGTCGTGTGGTTAAAGCGCGTGCTTTC 
AAAACGATGGTAGCAGATGTTATTATTAAC(^TGTCAAACT 
TATCGACCCCGACAGGGAGCACAGCCTACAATAAA^

AGCGCTGCAATTGACGGAAATTTCCAGTCTTAATAACCGTGTC 

ATTCCCAAAAAAGATAAGATTGAGTTAGTGCOVAAACGATTAGGAATTTATACCATTTCCA 

AAACCTATCAGTTAAAAAATCTGACGAAGGTGGAGTATTTTATCGACGATGAGAAAA 

CfCTCCGAGTCATAOGAGCTTTTGGGAAAGGGTCAAGGATGCCTTTATTGGAGAGAT 

Preferred GAS 236 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 7; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 7, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 236 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 7. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 7. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 7. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 7
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(5) GAS 040

GAS 040 corresponds to M1 GenBank accession numbers GI: 13621545 and GI: 15674449, to M3
GenBank accession number GI: 21909733, to M18 GenBank accession number GI: 19745402, and is
also referred to as 'Spy0269' (M1), 'SpyM3JM97' (M3), 'SpyM 18J)256' (M18) and 'prgA'. GAS
040 has also been identified as a putative surface exclusion protein. Amino acid and polynucleotide
sequences of GAS 040 from an M1 strain are set forth below:

SEQ ID NO: 9 

MDLEQTKPNQVKQKIALTSTIALLSASVGVSHQVKADDRASGBTKASNTHPDSLPKPETIQEAKATIDAV 
EKTLSQQKAELTE1ATALTKTTABINHLKEQQ 
TETELHNAQADQHS KETALS EQKAS I S AETTRAQDLVBQVKTSEQNI AKLNAMI SN PDAITKAAQTANDN
TKALSSEI^KAKADLENQKAKVKKQLTC 

PLBELKKLEASG YI GSAS YNN YYKEHADQI I AKAS PGNQLNQYQD I PADRNR FVDPDNLTPEVQNBLAQF 
AAHMINSVRRQLGLPPVTVTAGSQEPARLLSTSYKKTHGNTRPSFVYGQPGVSGHYGVGP I EDSA 
GASGLI RNDDNM YEN IGAFNDVHTVNGI KRGI YDS I KYMLFTDHLHGNTYGHAINFLRVDKHNPNAPVYL 
G FSTSNVGS LNEHFVM F PESN I AKHQRPNKTP I KAVGSTKDYAQRVGTVS DT I AAI KGKVS SLENRLS AI
HQBADIMAAQAKVSQLQGKIiASTLKQSDSLNLQTO^ 

S LKAALHQTEA1AEQAAARVTALV AKKAHLQYLRDF KLN PNRLQV I RBR I DNTKQDLAKTTS SLLN AQEA 
IAAWAKQSSLEATIATTEHQLTLLKTLANEKEYRHLDEDIATVPDLQVAPPLTGVKPLS 
QEMVKETKQLLEASARLAAENTSLVAEALVGQTSEMVASNAIVSKITSSITQPSSKTSYGSGSSTTSNLI 
SDVDESTQRALKAGWMIAAVGLTGPRFRKSSK



SEQ ID NO: 10 

ATCGACTTAGAACAAACGAAGCCA^ 

TGAGTGCCA GTGTAGGCGTATCTCACCAAGTCAAAGCAGATGATAGAGCCTCAGG 

TAATACTCACGACGATAGTTTACCAAAACCAGAMCAATT

GAAAAAACTCTCAGTCAACAAAAAGC^GAACTGACAGAGCTTGCTACCGCTCTGACA 
AAATCAACCACTTAAAAGAGCAGCAAGATAATGAACAAAAAGCTTTAACCTCT^ 
TAATACTCTTGCAAGTAGTGAGGAGACGCTATTAGCCCAAGGAGCCGAACATCAAAGAGAGTT^ 
ACTGAAACAGAGCTTCATAATGCTCAAGCAGATCAACATC 

CTAGCATTTCAGCAGAAACTACTCGAGCTCAAGATTTAGTGGAACAAGTCAAAACGTCTGAACAAAATAT
TGCTAAGCTCAATGCTATGATTAGCAATCCTGATGCTATCACTAAAGCAGCTCA 

ACAAAAG CATT AAG C T C AGAAT TGGAGAAGG CT AAAGCTGACTT AGAAAATC AAAAAG C TAAAG TT AAAA 

AGCAATTGACTGMGAGTTGGCAGCTCAGAAAGCTGCTCTAGCAGAAAAAGATC 

TAAATCCTCAGCTCCGTCTACTCAAGATAGCATTGTGGGTAATAATACCATGAAAGCACCGCAAGGCT 

CCTCTTGAAGAACTTAAAAAATTAGAAGCTAGTGGTTATATTGGATCA

AAGAGCATGCAGATCAAATTATTGCCAAAGCTAGTCCAGGTAATCAATTAAA 
AGCAGATCGTAATCGCTTTGTTGATCCCGATAATTTGACACCAGAAGTGCAA 
GCAGCTCACATGATTAATAGTGTAAGAAGACAATTAGGTCTACCACCAGTTACTCT 
AAGAATTTGCAAGATTACTTAGTACCAGCTATAAGAAAACT^ 

CGGACAGCCAGGGGTATCAGGGCATTATGGTGTTGGGCCTCATGATAAAACT^
GGAGCGTCAGGGCTCATTCGAAATGATGATAACATGTACGA^ 
CTGTGAATGGTATTAAACGTGGTATTTATGACAGT^^ 
AAATACATACGGCCATGCTATTAACTTTTTACGTGTAGATAAACATAA^ 
GGATTTTCAACCAGCAATGTAGGATCTTTGAATGAACACTTTGT^ 

ACCATCAACGCTTTAATAAGACCCCTATAAAAGCCGTTGGAAGTACAAAAGATTATGCC
CACTGTATCTGATACTATTGCAGCGATCAAAGGAAAAG 

CATCAAGAAGCTGATATTATGGCAGCCCAAGCTAAAGTAAGTCAACTTCAAGGTAAACT 
TTAAGCAGTCAGACAGCTTAAATCTCCAAGTGAGACAATTAAATGATACTAAAGGTTCTT^ 





ATTACTAGCAGCTAAAGCAAAACAAGCACMCTCGAAGCTACTCGTGATCM 

TCGTTGAAAGCCGCACTGCACCAGACAGAAGCCTTAGCAGAGCAAGCW 

TOGCTAAAAMGCTCATTTGCAATATCTAAGGGACTTTAAATTGAATCCT 

TGAGCGCATTGATAATACTAAGCAAGATTTGGCTAAAACTACCT^ 
TTAGCAGCCTTACAAGCTAAACAAAGCAGTCTAGAAGCTACTATTGCTACCACAGAACACCAG^

TGCTTAAAACCTTAGCTAAGGAAAAGQAATATCGCCACTTAQACGAAGATATAGCTACT^ 

GCAAGTAGCTCCACCTCTTACGGGCGTAAAACCGCTATCATATAGTAAGATAGATACTACTCC 

CAAGAAATGGTTAAAGAAACGAAACAACTATTAGAAGCTTCAGCAAGATTAGCT^ 

TTGTAGCAGAAGCGCTTGTTGGCCAAACCTCTGAAATGGTAGCAACT 
ATCTTCGATTACTCAGCCCTCATCTAAGACATCTTATGGCTCAGG^

TCTGATCmX^TGAAAGTACTCAAAG AGCTOT 

CAGGATTTAGGTTCCGTAAGGAATCTAAGTGA 

Preferred GAS 040 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 9; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 9, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 040 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 9. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 9. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 9. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 9 is removed. As another example, in one embodiment, the underlined amino acid
sequence at the C-terminus of SEQ ID NO: 9 is removed. Other fragments omit one or more domains
of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane
domain, or of an extracellular domain).



(6) GAS 389 

GAS 389 corresponds to M1 GenBank accession numbers GI: 13622996 and GI: 15675772, to M3
GenBank accession number GI: 21911237, to M18 GenBank accession number GI: 19746884, and is
also referred to as 'Spy1981' (M1), 'SpyM3_1701' (M3), 'SpyM18_2045' (M18) and 'relA'. GAS
389 has also been identified as a (p)ppGpp synthetase. Amino acid and polynucleotide sequences of
GAS 389 from an M1 strain are set forth below:

SEQ ID NO: 11 

MRNEMAKIMNVTGEEVIALAATYOT

DAVTVACGFLHDVVEDTDITIiDEIRADFGHDARDITO 
VILVKLADRLHNMRTLKHLRKDKQER^ 

MKEKRRERBALVEAI VSKVKTYTTQQGLFGDVYGRPKHI YS I YRKMRDKKKRFDQI PDLI AI RCVMBTQS 
DVYAMVGYIHELWRPMPGRFKDYIAAPKANGYQS IHTTVYGPKGPI EIQIRTKDMHQVAEYGVAAHWAYK 

KGVRGKVNQAEQAVGMNW I KELVELQDASNGDAVDFVDS VKEDI FSBRI YVFTPTGAVQELPKESGPIDF
AY AIHTQIGEKATGAKVNGRMVPLTAKLKTGDWE 1 1 TNANS FG PSRDWVKLVKTNKARNKI RQFFKNQD 
KELS VNKGRDLLVS YFQBQGYVANKYLDKKR I EAI LPKVSVKSBESLYAAVGFGDI S PI SVFNKLTEKER 
REEBRAKAKAEAEE LVKGGEVKHENKDVLKVRS ENGVI I QGASGLLMR I AKCCNPVPGDP I DG Y I TKGRG 
IAIHRSDCHNIKSQDGYQERLIEVEWDUDNSSKDYQAEIDIYGL 

PTKDMKFAN I HVS FGI PNLTHLTTWEKI KAVPDVYSVKRTNG

SEQ ID NO: 12 

ATGAGGAACGAAATGGCAAAAATAATGAACGTAACAGGAGAAGAAGTCATTGCCOT 
TGACCAAGGCTOATOTGGCTTTTGTGGCAAAGGCTTTAGCA
C^GAAAGTCAGGCG^CCCTATATCGTCCATCCG^ 
GATGCTGTGACA GT TGCTT GT GGCTTTTTACATGATGTC 
TCOAAGCAGACTTTGGCCATGATGCTCGTGATATCGT^^ 
5 CAAATCTCATGAGGAGCAACTCGCC<1AAAACCATC(K!AAAATC 
GTG ATTTTGOTC AAATTGGCTGACCG^ 

AAGAGCGCATTTCGCGCGAAACCATGGAAATCTATGCCCCCTTGGCGCATC^ 
CAAATGGQAACTAGAAGATrTGGCTTTTCGTTACCT 
ATGAAAGAAAAACGTCGCGAGCGTGAAGCTTTGGTAGAGGCTATTGTCA^ 
CACAACAAGGGTTGTTTGGAGATGTGTATGGCCGACCAA^
GGACAAAAAGAAACGATTCGATCAGATTTTTGATCI^ 

GATGTCTATGCTATGGTTGGCTATATTCATGAGCTTTGGOGTCCCATGC^ 

TTGCAGCTCCTAAAGCTAATGGCTACCAGTCTATTCATACC^ 

GATTCAAATCAGAACTAAGGACATGCATCAAGTG<Kri^ 
AAAGGCGTGCGTGGTAAGGTCAATCAAGCTGAGCAAGCCGTTGGCATCW

AATTGCAAGATGCCTCAAATGGCGATGCAGTGGACTTO 

ACCX^TTTATGTCTTTACACOGACAGGGGCCGTT 

GCTTATGCGATCaVTACGCAAATCGGTGAAAAAGCAACAGGTXX^ 

TCACTGCCAAGTTAAAAACAGGAGATGTGGTTGAAATCATCACCAATC 
AGACTGGGTAAAACTGGTCAAAACCAATMGGCTCGCAACAAA

AAGGAATTGTCAGTGAATAAAGGCCGTGATTTGTTGGTGTCTTAT^ 

ATAAATACCTTGACAAAAAACGCATTGAAGCCATCCTTCCAAAAGTCAGTGTGM 

CTATGCAGCCGTTGGGTTTGGTGACATTAGTCCTATCAGTC 

CGTGAAGAAGAAAGGGCCAAGGCTAAAGCAGMGCTGAAGAATTGGTTMGGGC^ 
AAAACAAAGATGTGCTCAAGGTTOK^GTGAAAATGGAGTCATTAT

GCGGATTGCCAAGTGTTGTAATCCTGTACCTGGTGATCCTATTGACGGCTA^ 

ATTGCGATTCACAGATCGGACTGTCATAACATTAAGAGTCAAGATGGCTAC^ 

TCGAGTGGGATTTGGACAATTCGAGTAAAGATTATCAGGCTGAAAT^ 

TGGTCTGCTTAATGATGTGCTCCAAATTTTATCAAACTCA^ 
CCGACCAAGGACATGAAGTTTGCTAATATTCACGTGAGCTTTGGCACT

CTGTTGTCGAAAAAATCAAGGCAGTTC CAGATGTTTATAGCGTGAAGCGGACCAATGGCTAA 

Preferred GAS 389 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 11; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 11, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 389 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 11. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 11. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 11. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(7) GAS 504 

GAS 504 corresponds to M1 GenBank accession numbers GI: 13622806 and GI: 15675600, to M3
GenBank accession number GI: 21911061, to M18 GenBank accession number GI: 19746708, and is
also referred to as 'SpyHSl' (M1), *SpyM3 J525\ 4 SpyM18J823' (M18) and 'fabK'. GAS 504
has also been identified as a putative trans-2-enoyl-ACP reductase II. Amino acid and polynucleotide
sequences of GAS 504 of an M1 strain are set forth below:

SEQ ID NO: 13

MKTRX TBLLN I DY PI FQGGMAWVADGDLAGAVSN AGGLGI XGGGNAPKEWKAN I DRVXA I TDRPPOVN X
MLLS PPADO I VDLVI BBGVKWTTGAGNPGKYMERLHQAOI I WP WPS VALAKRMBKLOVDA VI ABOIB 
AGGH I G KLTTMSLVRQ VVEAVS I P V I AAGG I ADG HGAAAAFM LGARAVQ I GTR FWAKB SN AHQN F KD K I 
LAAKDI DTV I S AQVVGH PVRS I KN KLTS AYAKAE KAFL I GQKTATDI BEMGAGSLRHAV I EGD WNGS VM 
AGQIAGLVRKEESCETILKDI YYGAARVIQNEAKRWQSVSIBK

SEQ ID NO: 14 

ATGAAAACACGTATTACAGAATTACTTAATATTGATTACCCCAT 
CTGATGGTGATTTAGCAGGTGCAGTTTCTAATGCTGGTGGTTTA 
CAAAGAAGTCGTTAAAGCTAATATTGATCGTGTCAAAGCTATTACTGATAGAC CT TT^
ATGCTTTTATCTCCTTTTGCTGATGATATCGTTGAT^ 

CAGGCGCAGGAAATCCAGGAAAGTATATGGAAAGACTGCACCAGGCGGGTATAATCQTTO 

CCCAAGCGTTGCGCTAGCCAAACGTATGGAAAAGCTTCGGGT^ 

GCTGGAGGACATATTGGCAAGTTAACGACTATGTCTTTAGT 
CTGTCATTGCGGCAGGTGGTATAGCTGATGGTCATGG

TGTTCAMTTGGAACTCGCTTTGTTGTTGCTAAA 

TTAGCAGCAAMGATATTGATACGGTGATTTCTGCGCAGGT 

ATAAATTGACCTCAGCTTACGCTAAAGCAGAAAAAGCATTTTTM 

TGAAGAAATGGGAGCAGGATCGCTTCGACACGCTGTTATTGAAGGCGATGTAGT^ 
GCTGGCCAAATTGCAGGGCTTGTGAGAAAAGAAGAAAGCTC^

GTGCAGCTCGTGTTATTCAAAATGAAGCTAAGCGCTGGCAATCTGTTTCAATAG 

Preferred GAS 504 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 13; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 13, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 504 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 13. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 13. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 13.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(8) GAS 509 

GAS 509 corresponds to M1 GenBank accession numbers GI: 13622692 and GI: 15675496, to M3
GenBank accession number GI: 21910899, to M18 GenBank accession number GI: 19746544, and is
also referred to as 'Spy1618' (M1), 'SpyM3_1363' (M3), 'SpyMlS.^* (M18) and 'cysM'. GAS
509 has also been identified as a putative O-acetylserine lyase. Amino acid and polynucleotide
sequences of GAS 509 of an M1 strain are set forth below:

SEQ ID NO: 15

MTKI YKT ITELVGQTPI IKLNRLI PNBAADVYVKLEAFNPGSSVKDRI ALSMI EAABAEGLI S PGDVI IB 
PTSGNTGIGLAWVGAAKGYRVI I VMPBTMSLERRQI IQAYGABLVLTPGAEGMKGAI AKAETLAI ELGAW 
MPMQFNNPANPS I HE KTTAQE I LEAF KE I S LDAFVSGVGTGGTLSGVSHVLKKAN PETV I YAVEAEESAV 
LSGQEPGPHKIQGISAGFI PNTLDTKAYDQI IRVKSKDALETARLTGAKE GFLVGI SSGAALYAAI EVAK 
QLGKGKHVLTI LPDNGBRYLSTELYDVPVI KTK

SEQ ID NO: 16 

ATGACTAAAATTTACAAAACTATAAGAGAATTAGTAGGTCAAACAC 
TTCCAAACGAAGCTCCTGACGTTTATGTAAAAT^ 
TATTGCTTTATCGATGATTGAAGCTGCTGAAGCTGAAGGTCTG^
CCAACMGTGGTAATACACK3TATTGGTCTTGCATGGCT
TTATGCCCGAMCTATGAGCTTGGAAAGACXXKAAATCATTCA 
ACCIXXoAGCAGAAGGTATGAAAGGGGCTATTGCAAAAGCTGAAA 
ATGCCTATGCAATTTMTAACCCTGCCMTCCMGCATC 
MG CTm WKXfrGATTTCm

TTCACATGTCTTGAAAAAAGCTAACCCTGAAACTGTTATCTATC 

TTATCTGGTCAAGAGCCTGGACCACATAAAATTCMGGTATATCAGCTGGATCT 
ATACCAAAGCCTATGACCAAATTATCCGTGTT^ 

AGCTAAGGAftGG CTTCCTGGTTGGGATTTCTTCTGGAG 
CAGTTAGGAAAAGGCAAACATGTGTTAACTATTTTACC^GATAATGGCGAACGCT
TCTATGATGTACCAGTAATTAAGACGAAATAA 

Preferred GAS 509 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 15; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 15, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 509 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 15. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 15. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 15. For
example, in one embodiment, the underlined amino acid sequence at the C-terminus of SEQ ID NO:
15 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(9) GAS 366

GAS 366 corresponds to M1 GenBank accession numbers GI: 13622612, GI: 15675424 and
GI: 30315979, to M3 GenBank accession number GI: 21910712, to M18 GenBank accession number
GI: 19746474, and is also referred to as 'Spy1525' (M1), 4 SpyM3J176' (M3), , SpyM18J542*
(M18) and 'murD'. GAS 366 has also been identified as a UDP-N-acetylmuramoylalanine-D-glutamate
ligase or a D-glutamic acid adding enzyme. Amino acid and polynucleotide sequences of
GAS 366 of an M1 strain are set forth below:

SEQ ID NO: 17 

MKVI SNFQNKKI LI LGLAKSGEAAA KLLTKLGALVTVNDSKPFDQN PAAQALLEEGI KYI CGSHPVELLD 
ENFEYMVKNPGI PYDNPMVKRAIAKEI PILTEVELAYFVSBAPI IGI TGSNGKTTTTTMI ADVLNAGGQS 
ALLSGNIGY PAS KWQKAI AGDTLVMELS S FQLVGVNAFRPHI AVITNLMPTHLDYHGSFEDYVAAKWMI
QAQMTBSDYLI LNANQE I SATLAKTTKATVI PFSTQKWDGAYLKDGILYFKEQAI I AATDLGVPGSHNI 
ENALATIAVAXLSGIADDI I AQCLSHFGGVKHRLQR VGQI KDITFYKDSKSTNI1ATOKALSGFDNSRLI 
LIAGGLDRGMEFDDLVPDLLGLKQMI I LGES ABRMKRAANKAEVS YLEARNVAEATELAFKLAQTGDTI L 
LS PANASWDMYPNFBVRGDEFLATFDCLRGDA 

SEQ ID NO: 18

ATGAAAGTGATAAGTAATTTTCAAAACAAAAAAATATTAATATTC 
CAGCAAAATTATTGACCAAACTTGGTGCTTTAGTGACTG 
AGCGGCACAAGCCTTGTTGGAAGAGGGGATTAAGGTCATTTGTGG 
GAGAACTTTGAGTACATGGTTAAAAACCCTGGGATTCC^ 
CAAAGGAAATTCCCATCTTGACTGAAGTAG^TTGGCTTATTTCGTATCT
TACAGGATOVAACXKX3AAGACAACCACAACGACAATGATTGCCGATG 
GCACTCTTATCTGGAAACATTGGTTATCCTGCTTCAAAAGTTGT^ 
TGGTGATGGAATTGTCCTCTTTTCAATTAGTGGGAGTGAATGCT^ 
TAATTTAATGCCGACTCACCTGGACTATCATGGCAGTTTT^
CAAGCTCAGATOACAGAATCAGA^ 

AGACCACCAAAGCAACAGTGATTCCTTTTTCAACT 

AATACTCTATTTTAAAGAACAGGCGATTATAGCTGCAACTGACTTAGGTCT 
GAAAATGCCCTAGCAACTATTGCAC5TTGCCAAGTTATCT0GTATTGCTG^

TTTCACATTTTGGAGGCGTTAAACATCGTTTGCAAC^^ 

TGACAGTAAGTCMCCAATATTTTAGCCACTCAAAAAGCTrrATCAGGTT^ 

TTGATTGCTCGCGGTCTAGATCGTGGCAATGAATTTG^ 

AGATGATTATITrGGGAGAATCCGCAGAGCGTATGAAGCGAGCTGCT 
TGAAGCTAGAAATGTGGCAGAAGCAACAGAGCTTCCTTTTAA

CTTAGCXX!AGCCAArGCTAGCTGGGATATCTATCCTAATTTTGAG 

CCTTTGATTGTTTAAGAGGAGATGCCTAA 

Preferred GAS 366 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 17; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 17, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 366 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 17. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 17. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 17. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
17 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
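
As a non-limiting sketch, criterion (b) above (a fragment of at least n consecutive amino acids, with n being 7 or more) can be expressed as a simple substring test. The function name `is_fragment` and the short reference string, taken from the opening residues of SEQ ID NO: 17 with extraction spaces removed, are assumptions made for illustration only.

```python
def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """Return True if candidate is a run of at least n consecutive
    residues taken from reference (criterion (b), with n >= 7)."""
    if n < 7:
        raise ValueError("n is 7 or more")
    return len(candidate) >= n and candidate in reference


# Illustrative reference: opening residues of SEQ ID NO: 17 (spaces removed).
reference = "MKVISNFQNKKILILGLAKSG"

print(is_fragment("NFQNKKI", reference))   # 7 consecutive residues
print(is_fragment("NFQNKK", reference))    # only 6 residues, too short
print(is_fragment("NFQAKKI", reference))   # not a consecutive run of the reference
```

A larger value of n (e.g. 10 or 20) simply tightens the length requirement; the consecutive-run requirement is unchanged.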

(10) GAS 159 

GAS 159 corresponds to M1 GenBank accession numbers GI: 13622244 and GI: 15675088, to M3
GenBank accession number GI: 21910303, to M18 GenBank accession number GI: 19746056, and is
also referred to as 'Spy1105' (M1), 'SpyM3_0767' (M3), 'SpyM18_1067' (M18) and 'potD'. GAS
159 has also been identified as a putative spermidine/putrescine ABC transporter (a periplasmic
transport protein). Amino acid and polynucleotide sequences of GAS 159 of an M1 strain are set
forth below:

SEQ ID NO: 19 

MRKLYSFIiAGVLGVI VI LTSLgFI LQKKSGSGSQSDKLVI YHWGDY I DPALLKKFTKBTGIEVOYBTFDS 
35 NEAMYTKI KQGGTTYDI AVPSDYTI DKMI KENLLNKLDKS KLVGMDN I GKEFLGKS FDPQNDYS LP YFWG 
TVGIVYNDQLVDKAPMHVTCDLWRPEYKNSIMLIDGARE 
PWKAIVADBMKGYMI(X3DAAIGITFSGBASEMLD 

FLN FINRPENAAQNAAYIGYATPNKKAKALLPDEI KNDPAFYPTDDI I KKLEVYDNLGSRWLGI YNDLYL 
QFKMYRK 


SEQ ID NO: 20 

ATGOGTAAACTTTATTCCTTTCTAGCAGGAGTTTTGGGTGTTATTC 
TCTTGCAGAAAAAATCGGGTTCTGGTAGTCAATC 

TGATCCAGCTTTGCTCAAAAAATTCACCAAAGAAACGGGCATTGAAGTGCAGT^ 
45 AATGAAGCCATGTACACTAAAATCAAGCAGGGCGGAACCACTTACGACATTC 
CCATTGATAAAATGATCAAAGAAAACCTACTCAATAAGCTTGATA 
TATCGGGAAAGAATTTTTAGGGAAAAGCTTTGACCCACAAAACGACTA 
ACCGTTGGGATTGTTTATAATGATCAATTAGTTGATAAGG 
CAGAATATAAAAATAGTATTATGCTGATTGATGGAGCX3CGTGAAAT 
TOGTTATAGTOTGAATTCTAAAAATCTAGAGCAGTTGCAGGCAGCCGA^
CCGAATGTTAAAGCCATTGTAGCAQATOAQATGAAAGGCTACA 
TTACCTTTTCTCCTX SA AGCCAGTGAGATGTTAGA 
AGGGTCTAACCTTnxnTTGATAATTTGCT 
5 TTTTTC AACTTTATCAATCCTCCTGAAA^ 

ATAAAAAAGCCAAGGCCTTACTTCCAGATGAGATAAAAAATGATCCTC 

TATCAAAAAATTGGAAGTTTATGACAATTTAGGGTCAAQ ATgGTTC 

CAATTTAAAATGTATCGCAAATAA 

Preferred GAS 159 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 19; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 19, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 159 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 19. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 19. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 19. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
19 is removed. In another example, the underlined amino acid sequence at the C-terminus of SEQ ID
NO: 19 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a
signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
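
The terminal truncations described above can be sketched, purely for illustration, as a slice that drops a chosen number of residues from each end. The function name `truncate_termini` and its parameters are hypothetical; the residue counts would in practice be chosen from the values listed above (e.g. 1 to 25 or more), and the input string below is a placeholder rather than a disclosed sequence.

```python
def truncate_termini(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return sequence lacking n_term residues from the N-terminus
    and c_term residues from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


# Placeholder ten-residue string used only to show the slicing behaviour.
print(truncate_termini("ABCDEFGHIJ", n_term=2, c_term=3))
```

Removing an underlined N-terminal or C-terminal stretch corresponds to setting `n_term` or `c_term` to the length of that stretch.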

(11) GAS 217

GAS 217 corresponds to M1 GenBank accession numbers GI: 13622089 and GI: 15674945, to M3
GenBank accession number GI: 21910174, to M18 GenBank accession number GI: 19745987, and is
also referred to as 'Spy0925' (M1), 'SpyM3_0638' (M3), and 'SpyM18_0982' (M18). GAS 217 has
also been identified as a putative oxidoreductase. Amino acid and polynucleotide sequences of GAS
217 of an M1 strain are set forth below:

SEQ ID NO: 21 

30 MAQRI IVITGASGGI^QAIVKQLPKEDSLILLGRNKERLBHCYQHIDNKECLEU^ITNPVAIEKWAQIY 
QRYGRIDVLINNACTGAFKGFEEFSAQBIADMFQVNT 

SAKSS I YSATKFALIGFSNALRLELADKGVYVTTVN PGP I ATKPPDQADPSGHYLBSVGKFTLQPNQVAK 
RLVSI IGKNKREIiNLPFSLAVTHQFYTLFPKLSDYLARKVFNYK 

SEQ ID NO: 22

ATOGCACAAAGAATCATTGCTATCACX3GGAGCTT 
CCAAGGAAGACAGCTTGATTTTACTAGGACGTAACAAAGAACGCCT 
CAACAAAGAATGCCTCGAGTTGGATATTACCAATCCAGTAGC 
40 CAGCGCTATGGCCGTATTGATGTCTTGATTAATAATGCrGGCTACGGA 
TTTCTGCCCAAGAAATAGCTGATATGTTTCAGGTTAAC^ 
TGGTCAGAAAATGGCAGAGCAGGGGCAAGGTCACCTTATTAATATTC 
TCAGCCAAATCGAGCATTTATTCAGCCACCAAGTTTGCCCTTAT 

AATTAGCGGATAAAGGGGTTTACGTGACCACCGTGAATC 
45 AGCTGACCCGTCTGGACATTATTTGGAAAGOGTTGGTAAATTTACTCTCCAAC 
CGTTTGGTTTCTATTATCGGGAAAAATAAACGAGAATTGAATTTC 
AATTTTACACCCTTTTCCCTAAATTATCTGATTATCT 

Preferred GAS 217 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 21; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 21, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 217 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 21. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 21. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 21.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
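
For illustration, criterion (a) (50% or more identity) can be checked on two sequences that have already been aligned. The sketch below makes two assumptions that this passage does not fix: the alignment is supplied by the reader, with `-` marking gaps, and identity is counted as identical columns divided by the total number of alignment columns. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical non-gap columns / alignment length.
    Other conventions (e.g. dividing by the shorter sequence) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Toy alignments, not taken from the sequence listing.
identity = percent_identity("MKVISNFQ", "MKVLSNFQ")
print(identity, identity >= 50.0)
print(percent_identity("MK-VI", "MKAVI"))
```

The threshold comparison (`>= 50.0`, or any of the higher values listed above) is applied to the returned percentage.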

(12) GAS 309 

GAS 309 corresponds to M1 GenBank accession numbers GI: 13621426 and GI: 15674341, to M3
GenBank accession number GI: 21909633, to M18 GenBank accession number GI: 19745363, and is
also referred to as 'Spy0124' (M1), 'SpyM3_0097' (M3), 'SpyM18_0205' (M18), W and 'rofA'.
GAS 309 has also been identified as a regulatory protein and a negative transcriptional regulator.
Amino acid and polynucleotide sequences of GAS 309 of an M1 strain are set forth below:

SEQ ID NO: 23 

MIEKYLESSIESKCQLIVLFFCTSYLPITEVABCTGLTFLQL 
20 THPFKETYLYQLYASSNVLQUiAPLI 

KI VGBBYRI RYI>I ALLYS KFGI KVYDLTQQDKNTI HS FLSHSSTHLKTSPWLSESFSF YDI LLALSWKRH 
QFS VTI PQTRI FQQLKKLFVYDSLKKSSHDI IETYCQ^ 

QYCQLFEENDTFRLLLNPI ITLLPNLKEQKASLVKALMF FSKSFLFNLQHFI PETNLFVS PYYKGNQKLY 
TSLKLIVEEWMAKLPGKRDLNHKHFHLFCHYVEQSU 
25 IDFHSYYU^DIWQIPDLKPDLVITHSQLIPF^^ 
DLTKQLT 

SEQ ID NO: 24 

TTGATAGAAAAATACTTGGAATCATCAATCGAATCAAAATGTCAGTTAATTC 
30 CTTATTTGCCAATAACTGAGGTAGCAGAAAAAACTGGCTTAAC^ 
GGAACTGAATGCCTTTTTCCCTGGTAGTC^ 
ACACATCCTTTTAAAGAAACTTATCTTTACCAACTCT 
TTTTAATAAAAAATGGTTCCCACTCTCGTCCCCHTA^ 

CTCAGCTTATCGGATGCGCGAAGCATTGATTCCTTTATTAAGAAACTTTGAAT^ 
35 AAGATTGTC(K3TGAGGAATATCGCATCCGTTACCTCATCGCTCTGCTATATAGTAAGTTTC 

TTTATGACTTGACGCAGCAAGACAAAAACACTATTCATAGCTTTCT 

AACCTCTCCTTGGTTATCGGAATCGTTTTCTTTCTATGACATTTTATTA 

CAATTTTCGGTAACTATTCCCCAAACCAG^TTTTTCAACAATTA 

TGAAAAAAAGTAGCCATGATATTATCGAAACTTACTC^^ 
40 CCTCTATTTAATTTATATCACCGCTAATAATTCTTTTGCGAGCTTACAATC 

CAATATTGTCAACTTTTTGAAGAAAATGATACTTTTCGCCTGCTTT^ 

CTAACCTAAAAGAGCAAAAGGCTAGTTTAGTAAAAGCTCTTATGTTTTTTTCAAAA 

TCTGCAACATTTTATTCCTGAGACCAACTTATTCXSTTTCTC 

ACGTCCTTAAAGTTAATTGTCX3AAGAGTGGATGGCCAAACTTC 
45 ATTTTCATCTTTTTTGCCACTATGTCGAGCAAACT 

CGTAGCCAGTAATTTTATCAATGCTCATCTCCTAACGGATTCTTTTCCAAGGTATTTCT 

ATTGATTTTOVTTCCTATTATCTATTGCAAGATAATGTTTA 

TCATCACTCACAGTCAACTGATTCCTTTTGTTCACCATGAAOT 

ATCTTTTGATGAATCGATTCTGTCTATCCAAGAATTGATGTATCA^ 
50 GATTTAACCAAGCAATTAACATAA 

Preferred GAS 309 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 23; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 23, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 309 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 23. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 23. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 23.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
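
Where a preferred fragment of (b) comprises an epitope, the candidate fragments of a given length n that span that epitope can be enumerated as sketched below. This is an illustrative sketch only: the function name `fragments_containing` is hypothetical, the epitope is assumed to be a known contiguous stretch of the reference, only its first occurrence is considered, and the strings used are placeholders rather than disclosed sequences.

```python
def fragments_containing(reference: str, epitope: str, n: int) -> list[str]:
    """List every fragment of exactly n consecutive residues of reference
    that fully contains the first occurrence of epitope."""
    if len(epitope) > n:
        raise ValueError("fragment length n is shorter than the epitope")
    start = reference.find(epitope)
    if start < 0:
        return []  # epitope is not a contiguous stretch of the reference
    # Earliest and latest window starts that still cover the whole epitope.
    first = max(0, start + len(epitope) - n)
    last = min(start, len(reference) - n)
    return [reference[i:i + n] for i in range(first, last + 1)]


# Placeholder reference and epitope, for illustration of the windowing only.
print(fragments_containing("ABCDEFGHIJ", "DEF", n=5))
```

With n set to 7 or more, each returned string satisfies the length requirement of criterion (b) while retaining the chosen epitope.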

(13) GAS 372 

GAS 372 corresponds to M1 GenBank accession numbers GI: 13622698 and GI: 15675501, to M3
GenBank accession number GI: 21910905, to M18 GenBank accession number GI: 19746500, and is
also referred to as 'Spy1625' (M1), 'SpyM3_1369' (M3), and 'SpyM18_1634' (M18). GAS 372 has
also been identified as a putative protein kinase or a putative eukaryotic-type serine/threonine kinase.
Amino acid and polynucleotide sequences of GAS 372 of an M1 strain are set forth below:

SEQ ID NO: 25 

20 M I Q I GKLFAGR YR I LKS I GRGGMADVYLANDLI LDNEDVAI KVLRTNYQTDQVAVARFQRBARAMAB LNH 
PNIVAIRDIGEEIXKX3PLVMEYVDGADLKRYIQ 

K I LLTKEG WKVTDFG I AVAFAETSLTQTNSMLGS VHYLS PEQARGSKATIQSDI YAMGI MLFBMLTGH I 
PYDGDSAVTI AI/}HFQKPLPS 1 1 EENHNVPQALEKWI RATAKKL SDRYGSTFEMSRDLMTALS YNRSRE 
RKIIFENVESTKPLPKVASGPTASVKLSPPTPTVLTQES 
25 FSFF I VGVALFTYLI LTKPTS VKVPNVAGTSLKVAKQELYDVGLKVGKI RQ I ESDTVAEGNWRTD PKAG 
TAKRQGSS ITLYVS IGNKGFDMENYKGLDYQBAMNSLI BTYGVPKSKI KI ERI VTNEYPENTVI SQSPSA 
GDKFN PNGKS KITLSVAVSDT ITMPMVT B YS YADAVNTLTALGI DASRI KAYVPSSS SATGFVPI HS PSS 
KAIVSGQSPYYGTSLSLSDKGEI SLYLY PEETHSSSSSSSSTSSSNSSSINDSTAPGSNTBI£PSETTSQ 
TP 


SEQ ID NO: 26 

ATGATTCAGATTGGCAAATTATTTGCTGGTCG 

CGGATGTTTATTTAGCAAATGACTTGATCTTGGATAATGAAGACGTTGCAA 

TTATCAAACAGATCAGGTAGCAGTTGCGCGTTTCCAACGAGAA 
35 CCCAATATTGTTGCCATCCGGGATATAGGTGAAGAAGACGGACAGCAATTTTT^ 

ATGGTGCTGACCTAAAGAGATACATTCAAAATCATGCTCCATTATCTAATAATC 

GGAAGAAGTCCTTTCTGCTATGACTTTAGCCCACCAAAAAGGAATTCT 

AATATCCTACTAACTAAGGAGGGTGTTGTCAAAGTAACTGATT^ 

CAAGCTTGACACAAACTAATTCGATGTTAGGCAGTGTTCATT^ 
40 O^GCGACGATTCAAAGTGATATTTATGCGATGGGGATTATGCT 

CCTTATGAreGCGATAGTGCTGTTACGATTGCCTTGCyUVCATTTTC 

AGGAGAACGACAATGTGCCACAAGCTTTGGAGAATGTTGTTATTC 

TCGTTACGGGTC^CCTTTGAAATGAGTCXSTGACTTAAT^ 

CGTAAGATTATCTTTGAGAATGTTGAAAGTACCAAACCCCTCCCCAAAGTGGCCTCAGGTCCCA 
45 CTGTAAAATTGTCTCCCCCTACCCCAACAGTGTTAA^^ 

AGATGCTTTACAGCCCCCCACCAAAAAGAAAAAAAGTGGTCGTTTT 

TTTTCTTTCTTTATTGTAGGTGTAGCACTCTTTACTTATCTTATACTAACT 

TTCCTAATGTAGCAGGCACTAGTCTTAAAGTTGCCAAA 

TAAAATCAGGCAAATTGAGAGTGATACGGTTGCTGAGGGAAATGT 
50 ACAGCTAAGAGGCAAGGCTCAAGCATTACGCTTTATGTGTCAATTGGAAACAAAG 
ACTACAAAGGACTAGATTATCAAGAAGCTATGAATAGTTTGATAGAAAC^

AATCAAAATTGAGCCCATTGTAACTAATGAATATC 

GGTGATAAATTTAATCCAAACGGAAACmrrAAAA 

TGCCTATGGTAACAGAATATAGTTATGCAGATGCAGTCAATACCTTAACAGCTTTA 

TAGAATAAAAGCTTATOTGOCAAOCTCrACCT 

AAAGCTATTGTCAGTGGTCAATCTCCTTACTATGGAACCT 

GTCTTTACCTTTATCCAGAAGAAACACACTCTTCTAGTAGCTC 

TTCTTCAATAAATGATAGTACTGCACCAGGTAGCAACACTGAAT^ 

ACACCTTAA 



Preferred GAS 372 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 25; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 25, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 372 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 25. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 25. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 25. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(14) GAS 039 

GAS 039 corresponds to M1 GenBank accession numbers GI: 13621542 and GI: 15674446, to M3
















also referred to as 'Spy0266' (M1), 'SpyM3_0194' (M3), and 'SpyM18_0250' (M18). Amino acid
and polynucleotide sequences of GAS 039 of an M1 strain are set forth below:

SEQ ID NO: 27 

MDLILFIAVLWJ£LGAYLLFKW^ 
30 LYQQLTDIRDVLHRSLSDSRDRSDKRLEKINQQW 

SFDSVSKQLBSVNKGLGEMRSVAQDVGTLNKVLSNTKTRGI LGELQLGQI IEDIMTSSQYBREFVTVSGS 
SERVE YAI KLPGNGQGGYI YLPI DSKFPLEDYYRLBDAYEVGDKLAI EASRKALLAAI KRFAKDI HKKYL 
NPPETITNFGVMFLPTEGLYSEVVRNASFroSLRREBNIWA 

KI LGNVKLEFDKFGGLIAKAQKQMNTANNTLDQLI STRTNAI VRALNTVETYQDQATKSLLKMPLLEEEN 
35 NBN 

SEQ ID NO: 28 

ATGGACCTTATCTTGTTCCTTTTGGTCTTGG 

ACGGCCTTCAACATCAGCTTGCCCAAACCCTAGAAGGCAACGC 
40 CCAGTTGGATACAGCTAACAAACAACAATTGTTAGAGCTAACACAG 

CTTTACC^CAATTAACAGATATTCGTGACGTCT 

ACAAACGCTTAGAAAAAATTAACCAGCAGGTCAACCAA 

ACGTTTGGAGAAAATGCGCCAGATCGTTGAAGAAAAATTGGAAGAAACCT^ 

TCTTTCGATTCTGTATCCAAGCAACTAGAAAGTGTCAATAAAGGCTTGGGAGAAATC 
45 AAGATGTGGGTACTTTAAATAAGGTTTTGTCCAATACCAAAACACGAGGCATTT^ 

AGGCCAAATCATTGAGGATATCATGACATCAAGCCAGTACGAAAGAGAATTTGTAACGGn 

AGTGAACGCGTAGAATATGCGATTAAGCTCCCAGGAAATGGTCAAGGCGGTTAT^ 

ACTCAAAATTCCCTCTTGAAGATTATTACCGATTAGAAGATGOT 

CGAGGCTAGCCGAAAAGCACTTCTGGCAGCTATCAAACGCTTTGCCAAAGACATTCA 
50 AACCCCCCAGAGACGACCAATTTCGGAGTTATGTTCTT^ 

GAAATGCGTCTTTCTTTGATAGCCTTCGTCGGGAAGAAAATATTGT^^ 
TGCTTTGCTOAATTCCITATCTGTTGG
AAAATTTTAGGCAATGTCAAGTTAGAATTCGATAAATTT^^ 
TGAATACAGCTMTAATACGCTGGATCAGCTCATTT 
TACCGTTGAAACTTATCAAGACCAAGCAACAAAATCTCTCTTGAACAT^ 
AATGAAAATTAA

Preferred GAS 039 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 27; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 27, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 039 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 27. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 27. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 27. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(15) GAS 042 

GAS 042 corresponds to M1 GenBank accession numbers GI: 13621559 and GI: 15674461, to M3
GenBank accession number GI: 21909745, to M18 GenBank accession number GI: 19745415, and is
also referred to as 'Spy0287' (M1), 'SpyM3_0209' (M3), and 'SpyM18J^S' (M18). Amino acid
and polynucleotide sequences of GAS 042 of an M1 strain are set forth below:

SEQ ID NO: 29 

MTKEKLVAFSQAHASPAWLQERRLAALEAI PNLELPTI ERVKFHRWNLGDGTLTBNE SLASVPDPI AIGD 
25 NPKLVQVGTQTVLEQLPMALIDKGWFSDFYTALEEIPEVIEAHPGQALAFDEDKIAAYHTAYFNSAA^ 
WPDHLEITTPIEAIFLQDSDSDVPFNKHVLVIAGKESKFTYLERFESIGNATQKISANISVBVIAQAGS 
Q I KFS AI DRLGPSVTTY I SRRGRLEKDAN I DW ALAVMNEGNVI ADFDSDL I GQGSQADLKWAAS SGRQV 
QG I DTRVTN YGQRTVGH I LQHGV I LERGTLT FNG I GH I LKDAKGADAQQES RVLMLS DQARADAN P I LLI 
DBNEVTAGHAAS IGQVDPEDMYYLMSRGLDQETABRLVI RGFU3AVI AEI PI PS VRQEI I KVLDEKLLNR 


SEQ ID NO: 30 

ATGACAAAAGAAAAACTAGTGGCTTTTTCGCAAGCCCACGCTGAGCCT 

TAGCGGCATTAGAAGCCATTCO^TTTGGAATTAC 

TCTAGGAGATGGTACCTTAACAGAAAATGAAAGTCTAGCTAGTGTTC 
35 AACCCAAAGCTTGTTCAGGTAGGCACGCAAACAGTCTT^ 

GAGTTGTTTTCAGTGATTTTTATACGGC^ 

GGCATTAGCTTTTGATGAAGACAAACTAGCrcCCTACCACACTC 

TACGTTCCTGATCACTTGGAAATCACAACTCCTATTGAAGCTATTTTCT^ 

TTCCTTTTAACAAGCATGTTCTAGTGATTGCAGGAAAAGAAAGTAAGTTCACCT 
40 ATCTATTGGCAATGCCACTCAAAAGATCAGCGCTAATATCAGTGTAGAAGTGATTGCTCAA 

CAGATTAAATTCTCGGCTATCGACCGCTTAGGTCCTTCAGTGACAACCTATATTAGCC 

TAGAGAAGGATCCCAAC^TTGATTGGGCC^ 

CAGTGATTTGATTGGTCAGGGCTCACAAGCTGATTTGAAAGTTC 

CAAGGTATTGACACGCGCGTGACCAACTATGGTCAACGTACG^ 
45 TTTTGGAACGTGGCACCTTAACGTTTAACGGGATTGGT 

TCAACAAGAAAGCCGTGTTTTGATGCTTTCTGACC^ 

GATGAAAATGAAGTAACAGCAGGTCATGCAGCTTCTATC^ 

TGATGAGTCGAGGACTGGATCAAGAAACAGCAGAACGATTGGTTACT 

CGCTGAAATTCCTATTCCATCAGTCCGCCAAGAGATTATT^ 
50 TAA 

Preferred GAS 042 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 29; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 29, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 042 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 29. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 29. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 29. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(16) GAS 058 

GAS 058 corresponds to M1 GenBank accession numbers GI: 13621663 and GI: 15674556, to M3
GenBank accession number GI: 21909841, to M18 GenBank accession number GI: 19745567, and is
also referred to as 'Spy0430' (M1), 'SpyM3_0305' (M3), and 'SpyM18jM77' (M18). Amino acid
and polynucleotide sequences of GAS 058 of an M1 strain are set forth below:

SEQ ID NO: 31 

MKMSGFMKTKSKRFLNLATLCIALLGTT^ SKRDYMTRFGLGDLEDDS ANYPSNLEAR YK 

GYLEGYBKGLKGDDIPERPKIQVPEDVQPSDHGDYRDGYEEGFGEGQHKRDPLETEABDDSQGGRQEGRQ 
20 GHQEGADSSDLNVEBSDGLSVIDEVVGVIYOAFSTIOTYLSGLF 

SEQ ID NO: 32 

ATGAAATGGAGTGGTTTTATCAAAACAAA^ 

TACTAGGAACAACTTTGCTAATGGCA CATCCCGTACAGGCGGAGGTGATATCAAAAAGAGACT 
25 TCGCTTCGGGTTAGGCGATTTAGAAGATGATTCAGCTAACTATCCTC 
GGATATCTAGAGGGATATGAAAAAGGCTTAAAAGGAGAT^ 
CTGAGGATGTTCAGCCATCTGACCATGGCGACT 

ACATAAACGTGATCCATTAGAAACAGAAGCAGAAGATGATTCTCAAGGAGGACGTCAAGAAGGACGTCAA 
GGACATCAAGAAGGAGCAGATTCTAGTGATTTGAACGTTGAAGAAAGCGA 
30 AAGTAGTTGGAGTAATTTATCAAGCATTTAGTACTATTTG 

Preferred GAS 058 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 31; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 31, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 058 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 31. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 31. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 31. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 31 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(17) GAS 290

GAS 290 corresponds to M1 GenBank accession numbers GI: 13622978 and GI: 15675757, to M3
GenBank accession number GI: 21911221, to M18 GenBank accession number GI: 19746869, and is
also referred to as 'Spy1959' (M1), 'SpyM3_1685' (M3), and 'SpyM18_2026' (M18). Amino acid
and polynucleotide sequences of GAS 290 of an M1 strain are set forth below:

SEQ ID NO: 33

MKHILFIVGSLREGSFNHQUu\QAQKA^ 

HI FTPVYNFSI PGSVKNLU)WLSRALDLSDPTGPSAIGGK\n^ 

AGEFTKATVNPDAWGTGRLBISKETKANLLSQAKALLAAI 


SEQ ID NO: 34 

ATGAAACATATTITATTTATTGTTGGCTCGCTT 
CACAAAAAGCTCTGGAACATCAAGCAGTTGTATCTTACTTAAATTGG 
AGATATCGAAGCTAATGCACCTTTACCAGTTGTTGACGCTCGTCAAGCTG 
1 5 TGGATTTTTACACCAGTTTACAACTTCTCTATTCCAGCTTCTGTTA 
GTGCTCTTGATTTGTCTGATCCGACGGGCCC^TCTGCTATTG 
TGCAAATGGCGGGCATGATCAAGTATTTGATC^GTTTAAAGCACT 

G CAGGAGAGTTT ACAAAAG C AACTGTGAATC CTG ATGCCTGGGGAACAGG AAGGCTTGAGATTT CAAAAG 
AGACAAAAGCAAACTTGCTATCTCAGGCAGAGGCTCTTTTAGCGGCTATTTAG 


Preferred GAS 290 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 33; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 33, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 290 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 33. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 33. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 33.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(18) GAS 511 

GAS 511 corresponds to M1 GenBank accession numbers GI: 13622798 and GI: 15675592, to M3
GenBank accession number GI: 21911053, to M18 GenBank accession number GI: 19746700, and is
also referred to as 'Spy1743' (M1), 'SpyM3_1517' (M3), 'SpyM18_1815' (M18) and 'accA'. Amino
acid and polynucleotide sequences of GAS 511 of an M1 strain are set forth below:

SEQ ID NO: 35 

MTDVSRILKEARDQGRLTTIJD YANLI FDD^ 

NLARN FGQ FN PEG YR KALRLMKQAEKFGR PWT F I NTAGAY PGVGAEBRGQGEAI AKN LMEM S DLKVP 1 1 
40 AI I IGEGGSGGALALAVADQVWMLENTMYAVLS PEG FAS I LWKDGSRATEAAELMKJ TAGEL YKMG I VDR 
1 1 PEHGYFSSEIVDI IKANLIEQITS LQAKPLDQLLDER YQRFRKY 

SEQ ID NO: 36 

ATGACAGATGTATCAAGAATTTTAAAAGAAGCGCGTGATCAAGGGCG 
45 ACCTTATTTTCGATGACTTTATGGAACTGCATGG 

TGGCCTAGCTTATTTGGCGGGACAACCTGTTACGGT 

AATTTGGCAAGGAATTTTGGC CAGCCCAATC (^GAAGGTTATCGTAAAGCTTTGCGCCTT ATGAAACAGG 
cagaaaaatttggacgaccagttgttarc
agaacgaggacagggtgaggccattgctaaaaatttgatggaaatgagt^ 
gccatcattattggtgaaggaggctctggtggtgca™ 
ttcaaaatactatgtatcoogttcttagcccagaaggcttt^ 
5 ggcgaccgaggccgctgaattgatgaaaatcacagcgggtgaactct 
attattccagaacatggttatttttcaagtcaaatcgtt^ 
taaccagtttgcaagctaagccattagaccaattattagatgagcgctaccaacgc^ 

A 

Preferred GAS 511 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 35; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 35, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 511 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 35. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 35. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 35.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(19) GAS 533 

GAS 533 corresponds to M1 GenBank accession numbers GI: 13622912 and GI: 15675696, to M3
also referred to as 'Spy1877' (M1), 'SpyM3_1621' (M3), 'SpyM18_1942' (M18) and 'glnA'. GAS
533 has also been identified as a putative glutamine synthetase. Amino acid and polynucleotide
sequences of GAS 533 of an M1 strain are set forth below:

SEQ ID NO: 37 

MAITVADIRREVKEKNVTFLRLMFTDIMGVMKNVBIPATKEQ 

LYPDLDTWIWPWGDG^GAVAGLICDIYTAEGKPFAGDPRGNLKRALIOiMNBIGYKSPNLGPBPEPPLPK 
30 MDDKGN PTLEVNDNGGYFDLAP I DLADNTRRE I VN I LTKMGFEVEASHHE VAVGQHE I DFKYADVLKACD 
N I Q I FKLVVKTI ARBHGLYATFMAKPKFGI AGSGMHCNMSLFDNQ 

GLMKHAYNYTAI TNPTVNSYKRLVPGYBAPVYVAWAGSNRS PLI RV PASRGMGTRLELRS VDPTANP YLA 
LAVLLEAGLDGI I NKI EAPEPVEAN I YTMTNEBRNEAGI I DLPSTLHNALKALQKDDWQKALGYHI YTN 
PLEAKR I EWSS YATFVSQWE I DHYI HNY 


SEQ ID NO: 38 

ATGGCAATAACAGTAGCTGACATTCGTCGTGAAGTCAAAG^ 

TCACTGATATCATGGGCGTTATGAAAAATGTGGAGATTCCTGCAACTAAAGAACAGTTAGACAAAGTATO 
GTCTAACAAGGTTATGTTTGATGGTTCATCTATCX3AAGG 
40 CTTTACCCCGATTTAGACACTTGGATTGTTTTTCCCTGGGGAGATG 
TTTGTGATATTTATAraGCAGAAGGAAAGCCT^ 

GAAACAC ATGAACGAGATCGGCTACAAATCATTTAATCTTGGAC CAGAACC AGAATTTTTC CTTTTTAAG 
ATGGATGATAAAGGTAATCCGACACTTGAAGTTAACGATAATGGTGGTTATTTTGATT^ 
ACTTAGCAGACAACACGCGCCGTGAAATTGTGAATATTTTAACGA 
45 TCATCATGAAGTGGCTGTTGGTCAACATGAGATTGATTTTA 

AATATTCAAATTTTTAAGCTAGTTGTAAAAACGATTGCCCGTGAACAT^ 

CTAAACCAAAATTTGGAATAGCTGGATCAGGGATGCACTGTAACATC 

TAATGCTTTTTATX^TGAAGCTGATAAGOSAGGGATGCA^ 

GGACTAATGAAGCATGCTTATAACTACACTGCTATCACTAACCCTACAGTGAATTCTTATAAACGATTAG 
50 TTCCAGGTTATGAGGCACCTGTTTATGTCGCTTGGGCTGGAAGTAATCG 
AGCATCACGTGCTATGGGAACGCGTTTCCftGTTACGTT C ra

ttggctgttctcrttggaagctgga 
ctaacatttataccatc^caatggaagaacgaaat^ 

TAATGCCTTAAAAGCTCTTCAAAAAGA1X2ATCnX}GTACAA 
5 TKTl AGAAGCAAAACGAATTGAATGGTCTTCCTATGCAA CT ITfCT T T CTCAATGGGAAATTGACCATT 
ATATTCATAATTATTAG 

Preferred GAS 533 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 37; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 37, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 533 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 37. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 37. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 37. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(20) GAS 527 

GAS 527 corresponds to M1 GenBank accession numbers GI: 13622332, GI: 15675169, and
GI: 24211764, to M3 GenBank accession number GI: 21910381, to M18 GenBank accession number
GI: 19746136, and is also referred to as 'Spy1204' (M1), 'SpyM3_0845' (M3), 'SpyM18_1155'
(M18) and 'guaA'. GAS 527 has also been identified as a putative GMP synthetase (glutamate
hydrolyzing) (glutamate amidotransferase). Amino acid and polynucleotide sequences of GAS 527 of
an M1 strain are set forth below:


SEQ ID NO: 39 

MTEI SILNDVQKI I VLDYGSQYNQLI ARRIREFGVFSELKSHKITAQBLRE INPIGIVLSGGPNSVYADN 
AFGI DPE I FELG I P I LOI CYGMQLI THKLGGKWPAGQAGNRE YGQSTLHLRETS KLFSGT PQEQLVLMS 
HGDAVTE I PEG FHLVGDSNDC P Y AAI ENTE KNLYGI QFH PBVRH S VYGNDI LKNFAI S I CGARGDWSMDN 
30 FIDMEI AKI RETVGDRKVLLGLSGGVDS SVVGVXJjQKAIGDQLTC I FVDHGLLRKDEGDQVMGMLGGKFG 
LN 1 1 RVDASKRFLDLLADVED PEKKRKI I GNEFVYVFDDEASKLKGVDFLAQGTL YTD 1 1 E SGTETAQT I 
KSHHNVGGLPEDMQFELIBPLNTLFKDEVRA1/3IALGMPEEIVWRQPFPGPGLAIRW 
ESDAI LRBE I AKAGLDRDWQ Y FTVNTGVRSVGVMGDGRTYDYT I A 
I STRI VNBVDHVNRI VYDI TSKPPATVEWE 


SEQ ID NO: 40 

ATGACTGAAATTTCAATTTTGAATGATGTTCAAAAAATTATrc 

AGCTTATTGCTAGACGTATTCGAGAGTTTGGTGTTTTCTCCGAACTAAAAAGCCATA 

AGAACl^CGTGAGATCAATCCC^TAGGTATCGTTTTATCAGGAGGGCCTAACT 
40 GCCTTTGGCATTGACCCTGAAATCTTTGAACTAGGGATTCCGA 

TAATCACCCATAAATTAGGTGGTAAAGTTGTTCCTGCTGGACAAGCTGGTAATCGTGAATACGCT 

AACCCTTCATCTTCGTGAAACGTCAAAATTATTTTCAGGCACACCTCAAGAACAACT 

CATGGTGATGCTGTTACTGAAATTCCAGAAGGTTTCCACCTTGTT^ 

CAGCTATTGAAAATACTGAGAAAAACCTTTACGGTATTCAGTTCC^^ 
45 TGGAAATGACATTCTTAAAAACTTTGCTATATCAATTTGTGGCG 

TTTATTGACATGGAAATTGCTAAAATTCGTGAAACTGTAGGTC 

GTGGAGTTGATTCTTCAGTTCTTGGTGTTCTACTT 

CGTTGATCACGGTCTTCTTCGTAAAGACGAGGGCGATCAAGTTATGGG 

CTAAATATTATCCGTGTGGATGCITCAAAACXSTTTC 
AAAAACGTAAAATTATTGGTAATGAATTTGTCTATGTTT^
TQACTTCCTT G CCCAAGGAACACTTTATACTGATATCATTGAGTCA^ 
AAATCACATCACAATGTGGGTGGTCTCCCCGAAGACATC 
TTTTCAAAGATGAAGTTCGAGCCCTTGGAATCGCTC^ 
5 ATTTCCAGGTCCTGGACTTGCTATCCGTGTCATGGGAGC^ 
GAATCAGACGCTATCCTTCGTGAAGAAATTGCT^ 
CAGTTAACACAGGTGTCCGTTCTGTAGGOGTCATOGGAGATGOTCCn'ACn 
TCGTGCTATTACGTCTATTGATGGTATGACAGCTGACTTTGCTCAAC^ 
ATCTCAACACGTATCGTAAATGAAGTTGACCACGTTAACCGTATCGTCT 
10 CCGCAACAGTTGAATGGGAATAA 

Preferred GAS 527 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 39; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 39, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 527 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 39. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 39. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 39. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(21) GAS 294 

GAS 294 corresponds to M1 GenBank accession numbers GI: 13622306, GI: 15675145, and
GI: 26006773, to M3 GenBank accession number GI: 21910357, to M18 GenBank accession number
GI: 19746111, and is also referred to as 'Spy1173' (M1), 'SpyM3_0821' (M3), 'SpyM18_1125'
(M18) and 'gid'. GAS 294 has also been identified as a putative glucose-inhibited division protein.
Amino acid and polynucleotide sequences of GAS 294 of an M1 strain are set forth below:

SEQ ID NO: 41 

30 MSQSTATY I NVIGAGLAGSEAAYQI AKRGI PVKLYEMRGVKATPQHKTTN FAELVCSNS FRGDSLTNAVG 
LLKEEMRRLDSI IMRNGEANRVPAGGAMAVDREGYAESVTAELENHPLIEVTRGBITEI PDDAITVI ATG 
PLTSDALAEKI HALNGGDGFYFYDAAAPI I DKSTIDMSKVYLKSRYDKGEAAYLNC PMTKEEFMAFHEAL 
TTABEAPLNAFEKEKYFEGCMPI EVMAKRGI KTMLYGPMKPVGLEYPDDYTGPRDGEFKTP YAWQLRQD 
NAAGSL YN I VGFQTHLKWGEQKRVFQMI PGLENAEFVR YGVMHRNS YMDS PNLLTETFQSRSN PNLFFAG 

35 QMTGVEGYVESAASGLVAGINAARLFKRBEALI FPQTTA1GSLPHYVTHADSKHFQPMNVNFGI I KELEG 
PRIRDKKERYEAIASRALADLDTCLASL 


SEQ ID NO: 42 

TTGTCTCAATCAACTGCAACTTATATTAATGTTATTGGAGCT 
40 AGATTGCTAAGCGCGGTATCCCCGTTAAATTGTATGAAATGCGTGGTGTCAAAGC 

AACCACTAATTTTGCCGAATTGGTCTGTTCCAACTCATTTC 

CTTCTCAAAGAAGAAATGCGGCGATTAGACTCCATTATTATGCGTAATGGTG 

CTGGGGGAGCAATGGCTGTTGACCGTGAGGGGTATGCAGAGAGTGTC^ 

TCTCATTGAGGTCATTCGTGGTGAAATTACAGAAtt 
45 CCGCTGACTTCGGATGCCCTGGCAGAAAAAATTCACGCGCT 

ATG<^GCAGCGCCTATCATTGATAAATCTACCATTGATATGAGC 

TAAAGGCGAAGCTGCTTACCTCAACTGCCCTATGACCAAAGAAGAATTC^ 

ACAACCGCAGAAGAAGCCCCGCTGAATGCCTTTGAAAAAGAAAAGTAT 

AAGTTATGGCTAAACGTGGCATTAAAACCATGCTTTATGGACCTATGAAACCOT 
50 AGATGACTATACAGGTCCTCGCGATGGAGAATTTAAAACGCCATATGCCGTCGTGCAATTG^ 




AATGCAGCTGGAAGCCTTTATAATATCGTTGGTTTC 
r rnt, > CAAATGATTCCA(XjGCTTGAAAATGCTGA GT TT G 
TATG6ATTCACCAAATCITTTAACCGAAACCTTCCMTCTCGGM 
CAGATGACTGGAGTTGAAGGTTATCTCGAATCAGCT GCT 
5 G T T 'T G TT CA AAAGAGAAGAAGCACTTATTTTTCCTCAGACAA 
GACTCATGCCGACAGTAAGCATTTCCAACCAATGAAra 
CCACGCATTCGTGACAAAAAAGAACGTTATGAAGCTATTGCT^ 
GCTTAGCGTCGCTTTAA 


Preferred GAS 294 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 41; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 41, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 294 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 41. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 41. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 41. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
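
For illustration, the identity threshold of criterion (a) can be checked once two sequences have been aligned. The sketch below assumes the alignment has already been computed by some pairwise alignment tool and is supplied as two equal-length strings with '-' marking gaps. It also assumes that identity is counted over all alignment columns, which is only one of several conventions; this passage does not specify the alignment algorithm or the denominator. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the
    same residue. Gap columns ('-') never count as matches.

    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: one substitution in ten columns.
identity = percent_identity("MKTLAVDPSN", "MKTIAVDPSN")
```

Raising `threshold` from 50 to, say, 95 corresponds to the narrower embodiments listed in parentheses in the paragraph above.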

(22) GAS 253 

GAS 253 corresponds to M1 GenBank accession numbers GI: 13622611, GI: 15675423, and
GI:21362716, to M3 GenBank accession number GI: 21910711, to M18 GenBank accession number
GI: 19746473 and is also referred to as 'Spy1524' (M1), 'SpyM3_1175' (M3), 'SpyM18_1541'
(M18) and 'murG'. GAS 253 has also been identified as a putative undecaprenyl-PP-MurNAc-
pentapeptide-UDPGlcNAc GlcNAc transferase. Amino acid and polynucleotide sequences of GAS
253 of an M1 strain are set forth below:

SEQ ID NO: 43 

MPKKI LFTGGGTVGHVTLNLI LIPKFI KDGWEVHY I GDKNGI BHTE I EKSGLDVTFHAI ATGKLRR YFSW 
30 QNLADVFKVA1X3LLQSLFIVAKLRPQALFSKGGFVSVPPW 

TTMYTTFEQBDQLSKVKHLGAOTKVFKDANQMPE I SD 

HPBLKQRYNIINITGDPHLNELSSHLYRVDYVTO 

GKEASRGDQLEN ATYFB KRG YAKQLQE PDLTLHNFDQAMADLFEHQAD YEATMLATKE I QS PDF FYDLLR 
ADISSAIKEK 


SEQ ID NO: 44 

ATGCCTAAGAAGATTTTATTTACAGGTCX5TGGAACTGTAGGTCATC 
CAAAATTTATCAAGGACGGTTGGGAAGTACATTATATTGGTGATAAAAAT 
TGAAAAGTCAGGCCTTGACGTGACCTTTCATGCTATCGCGAa 
40 CAAAATCTAGCTGATGTTTTTAAGGTTGCACTTCGCC^ 
GCCCTCAAGCCCTTTTTTCCAAAGGTGGTTTTGT 

TAAACCAGTCTTTATTCATGAATCAGATCGGTCAATGGGACTAGCAAAC^ 
ACTACCATGTATACCACTTTTGAGCAGGAAGACCAGTTGTCTAAAGTTA 
AGGTTTTCAAAGATGCCAACCAAATGCCTGAATCAACTCAGCT 
45 AGACCTAAAAACCCTCTTGTTTATTGGTGGTTCGGCAGGGGCX5CAT 

CATCCAGAATTGAAGCAACGTTATAATATCATCAATATTACAGGAGACCCTCACC^ 
CTCATCTGTATCGAGTAGATTATGTTACCGATCTCTACCAACCTTT^ 
GACAAGAGGGGGCTCTAATACACnTrTTGAGCTACTGGCAATGGCTAAGCT^ 
GGTAAAGAAGCTAGCCGTGGCGATCAGTTAGAAAATGCCACTTATTTTGAGA^ 
AATTACAGCAACCTQATTTAACTTTGCATA ATTI TGA TCA<y?CAATGGCTG AT TT GT TTOAACATCAGGC

TGATTATGAGGCTACTATGTTGGCAACTAAGGAGATTCAGTCACCGGA 

GCTGATATTAGCTCCGCGATTAAGGAGAAGTAA 

Preferred GAS 253 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 43; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 43, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 253 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 43. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 43. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 43. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
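
The terminal deletions described above (fragments lacking one or more amino acids from the C-terminus and/or the N-terminus) amount to slicing the reference sequence. A minimal Python sketch follows; the function name, its parameters and the toy sequence are hypothetical and serve only to illustrate the operation.

```python
def truncate_termini(sequence: str, n_terminal: int = 0,
                     c_terminal: int = 0) -> str:
    """Return sequence with n_terminal residues removed from the N-terminus
    (the start) and c_terminal residues removed from the C-terminus (the end)."""
    if n_terminal < 0 or c_terminal < 0:
        raise ValueError("numbers of removed residues must be non-negative")
    if n_terminal + c_terminal >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_terminal:len(sequence) - c_terminal]


# Hypothetical toy sequence, not taken from the sequence listing.
toy_sequence = "MKTLAVDPSNQR"

# Remove 2 residues from the N-terminus and 3 from the C-terminus.
truncated = truncate_termini(toy_sequence, n_terminal=2, c_terminal=3)
```

Omitting a whole domain, such as a signal peptide at the N-terminus, is the same operation with `n_terminal` set to the length of that domain, provided the domain boundary is known from a separate analysis.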

(23) GAS 529 

GAS 529 corresponds to M1 GenBank accession numbers GI: 13622403, GI: 15675233, and
GI:21759132, to M3 GenBank accession number GI: 21910446, to M18 GenBank accession number
GI: 19746203 and is also referred to as 'Spy1280' (M1), 'SpyM3_0910' (M3), 'SpyM18_1228'
(M18) and 'glmS'. GAS 529 has also been identified as a putative L-glutamine-D-fructose-6-
phosphate aminotransferase (Glucosamine-6-phosphate synthase). Amino acid and polynucleotide
sequences of GAS 529 of an M1 strain are set forth below:

SEQ ID NO: 45 

MCGIVGWGNWATDILMQGLEKLEYRGYDSAGI 
25 TRWAraGQSTEDNAHPHTSQTGRFVLVHNGVIE 

SVLBAFKKSI^IIEGSYAFALMDSQATDTIYVAKNKSPLLI 
ELVILTKDKVTVTDYDGKELIRDSYTAELDLSD^ 

DPA I ITS I QEADRL Y I LAAGTS YHAG FATKNMLEQLTDTP VBLGVASEWG YHM PLLS KKPMFI LLSQSGE 
TADSRQVLVKANAMGIPSLTVTrNVPGSTLS 
30 NGKQEALDFNLVHELSLVAQS I EATLS EKDLVAEKVQALLATTRNAFYIGRGNDYYVAMEAALKL KB I S Y 
IQCEGFAAGBLKHGTISLIEEDTPVIALISSSOLVASHTRGNIQEVAARGAHVLTVVEEGLDREGDDIIV 
NKVHPFLAPI AMVI PTQLI AYYASLQRGLDVDKPRKIAKAVTVE 

SEQ ID NO: 46 

35 ATGTGTGGAATTGTTGGAGTTGTTGGAAATCGCAATGCAACG 

TTGAATACCGGGGTTATGATTCAGCAGGAATTTTTGTGGCT 

AGTGGGGCGGATTGCTGATTTGCGTGCCAAGATTGGCATTGATGTTC 

ACCCGTTGGGCMCGCATGGCCAATOUCAGAGGA^^ 

TTGTACTTGTTCATAATGGTGTGATTGAAAATTAC^ 
40 TTTTAAGGGGCAGACAGATACTGAGATTGCAGTACACTTGATTGGAAAATTTGTC 

TCAGTACTGGAAGCTTTTAAAAAATCTTTAAGCATTATTGAAGGTTC 

GCCAAGCAACTGATACTATTTATGTGGCTAAAAACAAGTCTCC 

CAACATGGTTTGTTCAGATGCCATGGCCATGATTCGTC 

GAGCTAGTTATTTTAACC^AAGATAAGGTAACTGTTACAGACTACGATGGTAAA 
45 CCTACACTGCTGAATTAGACTTATCTGATATTGGCAAAGGGACT^^ 

TGATGAGCAACCAACCXJTAATGCGTCAATTAATTTCAACTTAT^ 

GATCCGGCTATCATTACCTCTATCCAAGAGGCTGACCGTCTTTATATTTTAGCGGCAGGGACT 
ATGCTGGTTTTGCAACAAAAAATATGCTTGAGCAATTGA 

TGAGTGGGGTTAC CACATGCCTCTGCTT AGCAAGAAAC CAATGTTTATTCTACTAAGCCAATCAGGAGAA 
ACCGCAGATAGTCGTCAAGTTTTACTAAAGGCAAAT^
TTCCAGGATCAACCTTATCACQTQAAGCAACATACACCATGTTGATTCATG 

TOCGTCTACAAAAGCTTACACTGCACAAATTGCTGCCCTTGCCr rTTTGG CTAAGGCAGTTGQTQAGGCA 
AATGGTMGCAAGAAGCTCTTOACTTTAACTTGCT 
5 OQACTTT GTCTG AAAAAGATCTCGTGGCAGAAAAGGTTCAAGCT^ 
TTACATCGGGCGTGGCAATGATTATTACGTTGCGATGGAAGCTGC^ 
ATTCAATGCGAAGGCTTTGCGGCTGGTGAATTGAAACA 
CAGTAATCGCTTTAATATCGTCTAGTCAGTTGGTTGCCTCT 

TGCCCGTGGGGCTCATGTTTTAACAGTTGTGGAAC^ • 
1 0 AATAAGGTTCATCCTTTCCTAGCCCCGATTGCTATGGTCA 
CATTACAACGTCGACTTY^TtnTGATAAGCCAC^ 

Preferred GAS 529 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 45; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 45, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 529 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 45. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 45. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 45. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(24) GAS 045 

GAS 045 corresponds to M3 GenBank accession number GI: 21909751, M18 GenBank accession
number GI: 19745421 and is referred to as 'SpyM3_0215' (M3), 'SpyM18_oppA' (M18) and 'oppA'.
GAS 045 has been identified as an oligopeptide permease. Amino acid and polynucleotide sequences
of GAS 045 from an M1 strain are set forth below:

SEQ ID NO: 47 

30 VTFMKKS KWLAAVSVAI LS VSALAA CGNKNASGGSEATKTYKYVPVNDPKSLDYI LTNGG 
GTTDVI TQMVIX3LLENDEYGNLVPS 1AKDWKVS KDGLT YTYTLRK5VSWYTADGEB YAPV 
TAEDFVTGLKHAVDDKSDALYWEDS I KNLKAYQNGEVDFKEVGVKALDDKTVQYTLNKP 
ESYWNSKTT^SVLPPVNAKFUCSKGKDFGTTDPSSILVNGAYPLSAPTSKSSMEFHKNEN 
YWDAKNVGI E SVKLTYSDGSDPGS F YKN FDKGE FSVARLY PND PTYKS AKKNYADN I TYG 

35 MLTGDIRHLTWNLNRTSFKNTKKDPAQ 

DAKTKALRNMLVPPTFVTIGESDFGSEVEKEMAKl^ 

FAKAKEALTAEGVTFPVQLDYPVDQAKAATVQEAQSPKQSVEASLGKENVIVNVIjETBTS 
THEAQGFYAETPEQQDYDIISSWWGPDYQDPRTYLDIMSPVGGGSVIQKLGIKAGQNKDV 
VAAAGLDTYQTLLDEAAAITDDNDARY 

40 FSGGFSWAGSKGPLAYKGMKLQDKPVTVKQYEKAKEK^ 

SEQ ID NO: 48 

GTGACTTTTATGAAGAAAAGTAAATGGTTGGCAGCTGTAA 
TCCGCTTTGGCAGCT TGTGGTAATAAAAAT^ 
45 TACAAGTACGTTTTTGTTAAOSATCCAAAATaTTGGATTA 
GGAACGACTGATGTGATAACACAAATGGTTGATGGTCT^ 

AATTTAGTACCATCACTTGCTAAAGATTGGAAGGTTTCAAAAGACGGTCTGACTTATACT 
TATACTCTTCGCGATGGTGTCTCTTGGTATACGGCTGATGGTGAAGAATATGCCCCAGTA 
ACAGCAGAAGATTTTGTGACTGGTTTGAAGCACGCGGTTGACGATAAATCAGAT^ 
50 TACX3TTGTTGAAGATTCAATAAAAAACTTAAAGGCTTA 

AAAGAAGTTCMTGTCAAAGCCCTTGACGATAAAACTGTTCAGTATACTTTGAACAAGCCT 
QAAAGCTACTOGAATTCAAAAACAACTTATAGTGTGCTTrTCCCAGTTA^
TTGAACTCAAAAGGTAAAGATTTTCXTTACAACCGAT 
GCTTACTTCTTGAGCGCCTTCACCTCAAAATC^ 
TACTGGGATGCTAAGAATGTTGGQATAGAATCTGTTAAATTGACTTACT 
5 GACCCAG(HTCGTTCTACAAGAACTTTGACAA 

CCAAATGACCCTACCTACAAATCAGCTAAGAAAAACTATGCTGATAACATTACTTA 
ATGTTCACTGGAGATATCCGTCATTTAACATGGAATT^ 

ACTAAGAAAGACCCTGCACMCAAGATGCCGGTAAGAAAGCTCTTAACAACAAOGATn 

CGTCAAGCTATTCA G T TTG C TTT T G ACCGAGCGTCATT^ 
10 GATGCCAAAACAAAAGCCTTACGTAACATGCTTGTCCCACCAACATTT^ 

GAAAGTGATTTTGGTTCAGAAGTTGAAAAGGAAATG^ 

GACGTTMCTTAGCTGATGCTCAAGATGGTTTCTATAATCCIX^ 

TTTGCAAAAGCCAAAGAAGCTTTAACAGCTGAAGGTGTAACCTTC 

TACCCTGTTGACCAAGCAAACGCAGCAACTGTTC 
15 GTTGAAGCATCTCTTGGTAAAGAGAATGTCATTGTCAATGTTCTT^ 

ACTCACGAAGCCCAAGGCTTCTATGCTGAGACCCCAGAACAACAAGACTACCa^TATCATT 

TCATCATGGTGGGGACCAGACTATCAAGATCCACGGACCTACCTTGACATCATGAGTCCA 

GTAGGTGGTGGATCTGTTATCCAAAAACTTGGAATCAAAGCAGGTCAAAATAAGGAT^ 

GTGGCAGCTGCAGGCCTTGATACCTACCAAAC TC TTCTTG A TGAAG 
20 GACGACAACGATGCGCGCTATAAAGCTTACGCAAAAGCACAAGCCTACCTTACAGATAAT 

GCCGTAGATATTCCAGTTGTGGCATTGGGTGGCACTCCACGAGTTACTAAAGCCGTTCCA 

TTTAGCGGGCK5CTTCTCTTGGGCAGGGTCTAAAGGTCCT 

CTTCAAGACAAACCTCTCACAGTAAAACAATACGAAAAAGCAAAAGAA 

GCAAAGGCTAAGTCAAATGCAAAATATGCTGAGAAGTTAGCTGATCAra 


Preferred GAS 045 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 47; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 47, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 045 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 47. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 47. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 47. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 47 is removed. Other fragments omit one or more domains of the protein (eg. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(25) GAS 095 

GAS 095 corresponds to M1 GenBank accession numbers GI: 13622787 and GI: 15675582, to M3
GenBank accession number GI: 21911042, to M18 GenBank accession number GI: 19746634 and is
also referred to as 'Spy1733' (M1), 'SpyM3_1506' (M3), 'SpyM18_1741' (M18). GAS 095 has also
been identified as a putative transcription regulator. Amino acid and polynucleotide sequences of
GAS 095 of an M1 strain are set forth below:

SEQ ID NO: 49

M KIGKKIVLMFTAIVLTTVLALGVYLTSACT 
SSERASKWEGNSDSMILVTVNPKTKKTTMTSL^ 

VQDLLN I TI DNYVQINMQGLI DLVNAVGGI TVTNEFDFPI S I AENEPEYQATVAPGTHKINGEQALVYAR 
MRYDDPBGDYGRQKRQREVIQKVLKKI LALDS I SS YRKI LSAVSSNMQTN I EI SSRTI PSLLGYRDALRT 
I KTYQLKGEDATLSDOGS YQI VTSKHLLE I QNR I RTBW5LHKVNQLKTNATVYKNLYGSTKSQTVNNNYD

SSGQAPSYSDSHSSYANYSSCTOTGQSASTDQDST^ 

NPOT 

SEQ ID NO: 50

ATCAAAATTG^ 

TCTATCTAACTAGTGCTTATACCTTCTCAA CAGGAC^ 

TTCAAACAAAAGTGATGCCATTAAACAAACAAGAG CTrrrTC TATCTTC 

TCTTCAGAGCGTGCCTCCMGTGGCAAGGAAACACTGATT 
1 0 CCAAGAAAACAACTATGACTAGTTTAGAACGAGATACCT^ 

AATGAATGGTGTTGAAGCTAAGCTTAACGCTGCW 

GTGCAAGATCTTTrGAATATCACCATTGATAACTATGTTCAAAT^ 

TGAATGCAGTTGGAGGGATTACAGTTACAAATGAGTTTGATTTTCCT 

TGAATATCAAGCTACTGTTCCGCCTGGAACACACAAAATTAAC^ 
1 5 ATGCGTTATGATCATCCTGAGGGAGATTATGGTCGACAAAAGCGTCAA 

TGAAAAAAATCCTTGCTCTTGATAGCATTAGCTCTTAT 

GCAAACGAATATCGAAATCTCTTCTCGCACTATCCCTAGT 

ATTAAGACTTATCAACTAAAAGGAGAAGATGCCACTTTATCAGATGG 

CTAATCATTTGTTAGAAATCCAAAATCGTATCCGAACAGAATTAG^ 
20 AACAAATGCTACTCHTTATGAAAATTTGTATGGGTCAACT 

TCTTCAGGCCAGGCTCCATCTTATTCTGATACT 

CCGGCCAG AGTGCTAGTACAGACCAGGACTCTACTGCTTCAAGC CATAGGC CAGCTACGCCGTCTTCTTC 

ATCAGATGCTTTAGCAGCTGATGAGTCTAGC^ 

AACCCTCAGACCTAA 


Preferred GAS 095 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 49; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 49, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 095 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 49. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 49. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 49. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 49 is removed. Other fragments omit one or more domains of the protein (eg. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).



(26) GAS 193 

GAS 193 corresponds to M1 GenBank accession numbers GI: 13623029 and GI:15675802, to M3
GenBank accession number GI: 21911267, to M18 GenBank accession number GI: 19746914 and is
also referred to as 'Spy2025' (M1), 'SpyM3_1731' (M3), 'SpyM18_2082' (M18) and 'isp'. GAS 193
has also been identified as an immunogenic secreted protein precursor. Amino acid and
polynucleotide sequences of GAS 193 of an M1 strain are set forth below:

SEQ ID NO: 51

MKKRKLIAVTLLSTILLNSAVPLWAOTSU^ 
KDHKPSHTHPTPPSOT)TKQTI)QASSEATO^ 
PDQQKDQTPDCTPEKSADKTPEKGPEKATDKTPEPNRDAPKPIQPP 
RSSAAYVRHWTGDSAYTHNLLSRRYGITAEQLDGFL^ 
AMAESSLGTQGVAKEKGANMFG YGAFDFN PNNAKKYSDEVA I RHMVEDTI I ANKNQTFBRQDLKAKKWSL
GQLDTLI DGGVY PTOTSGSGQRRADI MTKLDQMI DDKGSTPB I PBHLKI TSGTQPSBVPVG V KRSQPQNV 
LTY KSBTYS PGQCTWY A YNR VKELG YQ VDRYW^GGDWQR K PG FVTTHKPKVG YVVS PA PGQAGAOATYG 
HVAWBOI KBDGS I LI SBSNVMGU5T I S YRTFTAEQASLLTYWGDKLPR P 


SEQ ID NO: 52 

ATGMGAAAAGGAMTTGTTAGCAGTAACACTATTMGTACCATACTCn 
TTGTTGCTGATACCTCCTTGCGTAA^ 

GGATGACGAGAGTGAAACACCAAAAAAAGACAAAAAAAGCAAGGAMCAGCGTCG 
1 0 AAAGACCATAAGCCATCACACACTCACCCAACCCCCCCTTCAAATGATACT 

CATCTGAAGCTACTGACAAACCAAATAAAGACAAAAACGACACCAAGCAACCAGACAGCA 
CACCCCATCTCCX^AAGACCAGTC^^ 

CCTGATCAGCAAAAAGATCAGACACCTGATAAAACACCAGAAAAATCAGCTGAT 

GACXAGAAAAAGCAACTGATAAAACACCAGAGCCAAATO 
1 S AGCAGCTGCTCCTGTCTTTATACCTTGGAGAGAAAGTGAC 

CGCTCATCAGCGGCTTACGTGAGACACTGGACAGGTGACTCTGCCTA 

GTTATGGGATTACTGCTGAACAGCTAGATGGTTTTTTGMCAGTCT 

CTTAAACGGAAAGCGTTFATTAGAATCGGAAAAACTAACAGGACTA^ 

GCAATGGCAGAAAGCTC^CTAGGTACTCAGGGAGTTGCTAAAGAA 
20 GCGCCTTTGACTTCAACCCAAACAATGCCAAAAAATACAGCGATGAGGTTC 

AGACACCATCATTGCCAACAAAAACCAAACCTTTGAAAGACAAGACCTCAAAGCAAAA 

GGCCAGTTGGATACCTTGATTGATGGTGGGGTTTACTITAtt 

CAGATATCATGACCAAACTAGACCAATGGATAGATGATCATGGAAGCACACCTGAGA 

CAAGATAACTTCCGGGACACAATTTAGCGAAGTGCCCGTAGGTTATAAAAGAAGTCA 
25 TTGACCTACAAGTCAGAGACCTACAGCTTTGGCCAATGCACTTGGTACGCCT 

TAGGTTATCAAGTCGACAGGTACATGGGTAACGGTGGCGACIXMCAGCGCAAG 

CCATAAACCTAAAGTGGGCTATGTCGTCTCATTTGCACCAGGCCAAGC^^ 

CACGTTGCTGTTGTAGAGCAAATCAAAGAAGATGGTTCTATOT 

TAGGCACCATTTCCTATCGGACGTTCACAGCTGAGCAGGCTAGTTTGTTGA 
30 ACTCCCAAGACCATAA 

Preferred GAS 193 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 51; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 51, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 193 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 51. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 51. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 51. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(27) GAS 137 

GAS 137 corresponds to M1 GenBank accession numbers GI:13621842, GI:15674720 and
GI:30173478, to M3 GenBank accession number GI:21909998, to M18 GenBank accession number
GI: 19745749 and is also referred to as 'Spy0652' (M1), 'SpyM3_0462' and 'SpyM18_0713' (M18).
Amino acid and polynucleotide sequences of GAS 137 of an M1 strain are set forth below:

SEQ ID NO: 53

MSDKHINLVIVTGMSGAGKTVAIQSFEDLGYFTIDNMPPALVPKFLELIEQ 
50 KEINSTLDSIESNPSIDFRILFLDATDGELVSRYKETRRSHPLAAIX5RVUXSIR 
VDTTKLTPRQLRKTISDQFSEGSNQASFRIEVMSFGFKYGLPLDADLVF 
BDVPNYVMSHPESBVPYKHLliailVPILPAYQKEGKSVLTVAIGCTGGQHRSVA
8SKRDQNRRKBTVNRS 

SEQ ID NO: 54 

5 ATGTCAGACAAACACATTAATTTAGTTATTGTCACAGGAATGAGCGG 

AGTCTTTTGAGGATCTAGGCTACTTTACCATTGATAATATGC 

ATTAATTQAACAMCCAATGAAAATCGTAGGGTGGCTTTGGTTCT 

AAGGAAATTAATTCTACCTTAGATAGTATTGAAAGCAATCCTAGCATTC 

ATGCAACGGATGGAGAATTGGTGTCACGCTATAAAGAAACC^ 
10 TC G TGTGCTT G ATGGTArrCGATTGGAAAGAGM 

GTGQATACAACAAAATTGACCCCTAGACAATTGCGTAAAACCATTTO 

ATCAAGCC TC TTTCCGTATTt^GTGATC^ 

GGTTTTTGATGTGCGTTTTCTACCCAATCCTTATT^ 

GAGGACGTTTTTAATTATGTGATGTCTCACCCAGAAT^ 
1 5 TTGTCCCTATCTTACCGGCTTACCAAAAAGAAGGGAAGTCTGT^ 

AGGCCAACACCGCAGCGTTGC CT TT G CCCATTGCTTGQCAGAAAGT^ 

GAAAGCCATCGTGATCAAAATCGTCGTAAGGAAACGGTGAATCGTTCATGA 
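
The relationship between a polynucleotide sequence and the amino acid sequence it encodes can be illustrated by translating codons with the standard genetic code. The Python sketch below is illustrative only: the function and variable names are assumptions, the standard code is assumed, and alternative start codons are not special-cased. It is applied to the first 18 nucleotides printed for SEQ ID NO: 54, which correspond to the first six residues printed for SEQ ID NO: 53.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate dna in reading frame 1 until the first stop codon.

    Codons containing characters other than A, C, G or T are rendered
    as 'X'; an incomplete trailing codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First six codons of SEQ ID NO: 54 as printed above.
prefix = translate("ATGTCAGACAAACACATT")
```

Because codons with unrecognised characters are rendered as 'X', the same function gives a rough indication of where a printed polynucleotide sequence no longer corresponds to its printed amino acid sequence.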

Preferred GAS 137 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 53; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 53, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 137 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 53. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 53. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 53. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(28) GAS 084

GAS 084 corresponds to M1 GenBank accession numbers GI:13622398 and GI:15675229, to M3
GenBank accession number GI: 21910442, to M18 GenBank accession number GI: 19746199 and is
also referred to as 'Spy1274' (M1), 'SpyM3_0906' and 'SpyM18_1223' (M18). GAS 084 has also
been identified as a putative amino acid ABC transporter/periplasmic amino acid binding protein.
Amino acid and polynucleotide sequences of GAS 084 of an M1 strain are set forth below:

SEQ ID NO: 55 

M I I KKRTVAI LAI ASSFPLVA CQATKSLKSGDAWGVYQKQKSITVGPDNTFVPMGYKDESGRCKGFDIDL 
AKEVF HQ YGL KVN FQ A I NWDMKEAELNNGK I DVI WNGY S I TKERQDKVAFTDS YMRNEQI I WKKRSDI K 
T I S DMKHKVLGAQSAS S GYDS LLRTPKLLKDF I KNKDANQYETFTQAFI DLKSDR I DGI LI DKVYANYYL 
40 AKEGQLEN YRM I PTT FEN EAFS VGLRKED KTLQAKI NRAFRVLYQNG KFQAI SB KWFGDD VATAN I KS 

SEQ ID NO: 56

ATGATTATAAAAAAAAGAACCGTAGCAATTTTAGCCATAGCTAGTA 

CTACTAAAAGTCTTAAATCAGGAGATGCTTGGGGAGTTTAC 
45 TGACAATACGTTTCTTCCTATGGGCTATAAGGATGAAAGCGGC 

GCTAAAGAAGTTTTTCACCAATATCGACTCAAGGTTAACTTTC 

CAGAACTAAACAATGGTAAAATTGATGTAATCTGGAATGGTTATTC^ 

GGTTGCCTITACTGATTCTTACATGAGAAATGAACAAACT^ 

ACAATATCAGATATGAAACATAAAGTGTTAGGAGCACAATCAGCTTCATC^ 
50 GAACTCCTAAACTGCTGAAAGATTTTAT^^ 
TTTTATTGATTTAAAATCAGATCCTATCGATGGAATATTGATTC
GCAAAAGAAGGGCAATTAGAGAATTATOGGATCATCCCAACGACCTTTGAA 
GACTTAGAAAAGAAQACAAAACGTTGCAAGCAAAAATTAATCGTGCTTTCA 
CAAATTTCAAGCTATTTCTGAGAAATGGTTTGGAGATGATGTT^ 


Preferred GAS 084 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 55; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 55, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 084 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 55. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 55. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 55. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 55 is removed. Other fragments omit one or more domains of the protein (eg. omission of
a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(29) GAS 384 

GAS 384 corresponds to M1 GenBank accession numbers GI: 13622908 and GI: 15675693, to M3
GenBank accession number GI: 21911154, to M18 GenBank accession number GI: 19746801 and is
also referred to as 'Spy1874' (M1), 'SpyM3_1618' (M3), and 'SpyM18_1939' (M18). GAS 384 has
also been identified as a putative glycoprotein endopeptidase. Amino acid and polynucleotide
sequences of GAS 384 of an M1 strain are set forth below:

SEQ ID NO: 57 

25 MKTIAFDTSNKTLSIAILDDBTLLADMTLN^ 

GLR VA VATAKTLA Y S LN I ALVG I S SL Y ALAASTCKQ YPNTL W PL I DARRQN A YVG YYRQG KS VM PQAHA 
SLEVI IEQLVEEGQLI FVGBTAPFABKIQKKLPQAI LLPTLPSAYECGLLGQSLAPENVDAFVPQYLKRV 
EAEENWLKDNE I KDDSHYVKR I 

SEQ ID NO: 58

ATGAAGACACTTGCATTTGATACCTCAAATAAAACCTTGTCC 

TAGGAGATATGACCCTTAACATTCAGAAAAAACATAGTGTTAGCC 

GACTTGTACTGATCTTAAACCTCAAGATTTAGAAAGAATAGTGGTTGCAA 

GGTTTACGAGTGGCAGTTGCTACTGCAAAAACGTTAGCGTACAGT 
35 CGAGTCTATATGCTTTGGCTGCGTCTACTTGTAAACAGTATCCAAATA 

TGCTAGAAGGCAAAATGCGTATGTAGGTTATTATCGGCAAGGAAAATCAGTGAT 

TCACTAGAAGTTATTATAGAACAATTAGTAGAAGAAGGACAGCTGA 

TTGCTGAGAAAATTCAAAAGAAACTACCTCAGGC^ 

TGGTCTTTTGGGGCAAAGTTTGGCACCAGAAAATGTAGA 
40 GAAGCTGAAGAAAACTGGCTCAAAGATAATGAGATAAAAGATGATAGTCACT 

Preferred GAS 384 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 57; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 57, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 384 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 57. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 57. Other preferred fragments lack one or more amino
acids (eg. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 57. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(30) GAS 202 

GAS 202 corresponds to M1 GenBank accession numbers GI:13622431 and GI: 15675258, to M3
GenBank accession number GI: 21910527, to M18 GenBank accession number GI: 19746290 and is
also referred to as 'Spy1309' (M1), 'SpyM3_0991' (M3), 'SpyM18_1321' (M18) and 'dltD'. GAS
202 has also been identified as a putative extramembranal protein. Amino acid and polynucleotide
sequences of GAS 202 of an M1 strain are set forth below:

SEQ ID NO: 59 

MLKRLWLILGPLLIAFVLWITIFSFPTQLDHSIAQEKANAVAITDSSFIOJGLIKRQALSDETCRFVPFP 
GSSEWSRMDSMHPSVLAERYKKSYRPFLIGKRGSASL£ 
15 PSAVQMYLSNTQVI E FLL KARTDKESQFAAKRLLELN PGVS KSNLLKKVSKGKS LSRLDRAI LKCQHQVA 
LREBSLFSFIXSKSTNYEKRILPRVKGLPKVFSYKQLN^ 
YKNFQVNYSYLASPEYNDFQLLI^EFAKRKTDVLFVITPVN^ 
FHRIADFSKDGGESYFMQDTIHIX3WNGWLAFDKKVQPFLETKQPVPNYKMNPYFYSKIW 

SEQ ID NO: 60

ATGCTTMGAGACTCTGGTTAATTCTAGGTCCTCTTCTTATTG 

TTAGTTTTCCTACACAACTTGATCATTCCATAGCTCAGGAAAM 

TTCTTTTAAAAATGGTTTGATTAAAAGACAAGCTTTATCAGATGAGA 

GGTTCTAGCGAATGGAGTCGAATGGATAGTATGCACCCTTCG 
25 ATAGACCATTTTTAATTGGTAAGAGAGGATCAGCATCTTTGTC 

CAATGAAATGCAAAAGAAAAAAGCCATCTTTGTAGTATCTCCTCA^ 

CCTAGTGCGGTTCAGATGTACTTGTCTAACACTCAAGT^ 

AAGAATCACAGTTTGCAGCAAAGCGTTTGCTTGAGCT7AACCCTGGTC 

AAAAGTAAGTAAGGGTAAGTCTCTTAGTCGGTTAGACAGAGCTATTTTGAAATC 
30 TTGAGAGMGAGTCCCTTTTTAGTTTTTTAGGCAAATCT^ 

TTAAGGGATTACCTAAAGTATTTTCGTA^ 

AACAACCAACAACCGTTTTGGGATTAAAAATACATTT^ 

TATAAGAATTTCCAAGTTAATTATAGTTACCTGGCGTCACCAGAATACAATGA 

CAGAATTTGCTAAACGAAAAACAGATGTACTCTTTGTTATAACTCCTC 
35 TACCGGCTTAAATCAAGATAAGTATCAAGCGGCAGTTCGTAAAATAAAATTCCAGTTAAAGTCACAAGGA 

TTTCATCGCATTGCTGACTTCTCAAAAGATGGTGGTGAGTCCTACT^ 

GTTGGAATGGCTGGTTAGCTTTTGATAAGAMGTC 

CTATAAAATGAACCCTTATTTTTATA^ 

Preferred GAS 202 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 59; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 59, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 202 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 59. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 59. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 59. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(31) GAS 057 

GAS 057 corresponds to M1 GenBank accession numbers GI: 13621655 and GI: 15674549, to M3
GenBank accession number GI: 21909834, to M18 GenBank accession number GI: 19745560 and is
also referred to as 'Spy0416' (M1), 'SpyM3_0298' (M3), 'SpyM18_0464' (M18) and 'prtS'. GAS
057 has also been identified as a putative cell envelope proteinase. Amino acid and polynucleotide
sequences of GAS 057 of an M1 strain are set forth below:

SEQ ID NO: 61 

10 MBKKQRFSLiUCYKSCTFSVLICSVFLVffl^ 
TSQITLKTNREKEQSQDLVSEPTTTBLAOT^ 
KGWKWAVIDTGIDPAHQSMRISDVSTAKVKS 

ENQFBDFDBDWENFBFDAEABPKAIKKHKIYRPQSTQAPKETVIKTEETDGSHDIDWTQTDDOT 
MHVTGI VAGNSKEAAATGBR FLGI APBAQVMFMR VFANDIMGSABSLFI KAI EDAVALGADVI NLSLGTA 
1 5 HGAQLSGSKPLMKAI EKAKKAGVSVWAAGNERVYGSDHDDPLATN PDYGLVGSPSTGRTPTSVAAINSK 
WVIQRUTTVKELENRADLMHGKAI Y SESVDFKDI KDSLGYDKSHQFAYVKESTDAGYNAQDVKGKI ALI B 
RD PNKTYDEMI ALAKKHGALGVLI FNNKPGQSNRSMRLTANGMGI PSAFI SHEFGKAMSQLNGNGTGSLB 
FDSWSKAPSQKGNEMNHFSNWGLTSTCYLKPDITAPGGDIYSTYNDNHYGSQTCT 

KQYLEKTX}PNLPKEKIADIVKNLLNSNAQIHVNPETKTTTSPRQQGAG 
20 ISLGNITJXTMTFDVTVHNLSN^ 

VTMDVSQFTKELTKQM PNGYYLEGFVRFRDSQDDQLNRVN I PFVGFKGQFENLAVAEES I YRLKSQGKTG 
FYFDESGPKDDI YVGKHFTGLVTIXaSETNVSTKTI SDNGLHTX/n'FKHADGKFI LEKNAQGNPVLAISPN 

GDHKQDFAAFXGVFLRKYQGLKASWHASDKEHKNPLWVSPBSFKGDKNFNSDIRFAKST^ 
SLTGAELPDGHYHYWSYYPDWGAKRQEMTFDMI 
25 DSVFYLERKDNKPYTVTINDSYKYVSVEDNKTFVERQArc^ 
GDHLPQTUSKTPIKLKLTDGNYQTKETIJCDiaE^ 

PNEDGNKDFVAFKGLKNNVYNDLTVNVYAKDDHQKQTPI WS SQAGASVSA I BSTAWYGI TARGSKVMPGD 
YQYVVTYRDEHGKKHQKQYTI SVNDKKPMITQGRFDTINGVDHFTPDKTKALDS SGI VREE VF YLAKKNG 
RKFDVTEGKDGI TVSDNKVY I PKNPDGS YTI SKRDGVTLSDYYYLVEDRAQJVS FATLRDLKAVGKDKAV 
30 VNFGU)LPVPEDKQIVNFTYLVRDAIX5KPIEWLEYYNNSGNSLILPYGKYTVELLTYDTN 

SFTLSADNNFQQVTFKITMLATSQITAHFDHLLPEGSRVSLKTAQDQLIPLEQSLYVPKAYGKTVQEGTY 

BVWSLPKGYRIEGNTKVNTLPNBVHEI^LRLVKVGDASDSTGDHKVM 

AK ALPSTGEKWGLKLRI VGLVLLGLTCVPSRKKSTKD 

SEQ ID NO: 62

GTGGAGAAAAAGCAACXSTTTTTCCCTTAGAAAATACAAATCAGGAACGTTTTCG^ 
TTTTCTTGGTGATGACAACAACAGTAGCAG CAGATGAGCTAAGCACAATGAGCG 
TCACGCTCAACAACMGCGCAAC^TCTCACCAATACAGAGTTGAGCT 
ACATCACAAATCACTCTCAAGACAAATCGTGAAAAAGAGCAATCACAAGATC 

40 CAACTGAGCTAGCTGAC^C^GATGCAGCATCAATGGCT^ 

TTCTTTACCGCCAGTCAATACAGATGTTCACGATTGGGTAAAAACCAAAGGAGCTTG 
AAAGGAC^GGCAAGGTTGTCXSCAGTTATTGACACAGGGATCGATCCGGC 
GTGATGTATCAACTGCTAAAGTAAAATCAAAAGAAGACATGCTAGGACGCCAAAAAGC 
TTATGGGAGTTGGATAAATGATAAAGTTGTTTTTGCACATAATTATGTGGAAAATA 

45 GAAAATCAATTCGAGGATTTTGATGAGGACTXX3GAAAAC 

CCATCAAAAAACACAAGATCTATCGTCCCCAATCAACCCAGGCACCGAAAGAAACTGT^ 
AGAAACAGATGGTTCACATGATATTGACTGGACACAAACAGACGATGACACCAAATACGAGTCAtt 
ATGCATGTGACAGGTATTGTAGCCGGTAATAGCAAAGAAGCCGCTGCTACTGGAGA^ 
TTGCJVCCAGAGGCCCAAGTCATGTTCATGCX3TGTTTTTGCCAA 

50 CTTTATCAAAGCTATCGAAGATGCaSTGGCTTTAGGAGCAGAT^ 
AATGGGGCACAGCTTAGTGGCAGCAAGCCTCTAATGG^^ 
CAGTTGTTGTAGCAGCAGGAAATGAGCGCGTCTATGGATCTGAC 

AGACTATGGTTTGGTCGGTTCTCC CTCAACAGGTCGAACAC C AACATCAGTGGCAGCT AT AAACAGT AAG 
TGGGTGATTCAACGTCTAATGACGGTCAAAGAATTAGAAAACCGTGCCGATT^ 
55 TCTATTCAGAGTCTGTCGACTTTAAAGACATAAAAGATA^ 
TTATOTCAAAGAGTCAACTGATGCGGGTTATAACGCACAAGACGTTAAA^^

CGTGATCCCMTAAAACCTATGACGAAATGATTGCTTTGGCTAA 
TTTTTAATAACAAGCCTGGTCyVATCAAACCGCTCAATGCGTCTAACAGCT 

TGCmCATATCXK^CGAATTTGGTAAGGCCATGTC 
S TTTGACAGTCTGGTCTCAAAAGCACCGAGTCAAAAAGGCAATGA 

TAACTTCTGAT0GCTATTTAAAACCT6ACATTACTGCACCAGGTGGCGA 
TAACCACTATGGTAGCCAAACAGGAACAAGTATGGCCTCTCCTCAGATTGCT^^ 
AAAC AAT AC CT AGAAAAGACTCAGC C^AACTTGCCAAAAGAAAAAATTGCTGAT ATCGTT AAQAACCTAT 
TGATGAGCAATGCTCAAATTCATGTTMTCCAGAGA 
10 AGGATTACTTMTATT6ACGGAGCTGTCACTAGCGGCCTTTA7OT 
ATATCATTAGGCAACATCACAGATACGATGACGTTra 

AAACATTACGTTATGACACAGAATTGCTMCAGATCATGTAGACCCACAAA 

TTCTCACTCCTTAAAAACGTACCAAGGAGGAGAAGTTACAGTCCCAGCCAATGGA 

GTTACCATXK5ATGTCTCACAGTTCACAAAAGAGCTAACAAAACAGA 
1 5 GTTTnrrCCGCTTTAC^ WTACTOUVG ATGACCAACT 

AGGGCAATTTGAAAACTTAGCAGTTGCAGAAGAGTCCATTTACAGATTAAAATCT 

TTTTACTTTGATGAATCAGGTCCAAAAGACX^TATCT 

TTGGTTCAGAGACCAATGTGTCAACCAAMCGATTTCTGACAATGGTCT 

AAATGCAGATGGCAAATTTATCITAGAAAAAAATGCCCAAGG 
20 GGT(^CAACAACCAAGATTTTGCAGCCTTCAAAGGTGTTTTCTTGAGAA 

GTGTCTACCATGCTAGTGACAAGGAACACAAAAATCCACTGTGGGT 

TAAAAACTTTAATAGTGACATTAGATTTGCAAAATCAACGACCCTGTTAGGCA 

TCGTTAACAGGAGCTGAATTACCAGATGGGCATTATCATTAT^ 

GTGCCAAACGTCAAGAMTGACATTTGACATGATTTT^ 
25 ATTTGATCCTGAAACAAACCGATTCAAACCAGAACCCCTAAAAGACCGTGGATTAGCTGG 

GACAGTGTCTTTTATCTAGAAAGAAAAGACAACAAGCCTTATACAGTTACGATAAAC 

ATGTCTCAGTAGAAGACAATAAAAC^TTTGTGGAGCGACAAGCTGATGGCAGCTTTA 

TAAAGC^AAATTAGGGGATTTCTATTACATGGTCGAGGATTTTC 

GGAGATCACITACCACAAACATTAGGTAAAACACCAATTAAAOT 
30 CCAAAGAAACGCTTAAAGATAATCITCAAATGACACAGTCTGACACA 

GCTAGCAGTGGTGCACCXJCAATCAGCCGCAAAGCCAGCTAACAAAGATGAATCA 

CCAAACGAAGATGGGAATAAAGACTITGTGGCCTTTAAAGGCTTGAAAAATA^ 

(XXTTTAACGTATACGCTAAAGATGACCACCAAAAACAAACC^ 

TGTATCCGCTATTGAAAGTACAGCCTGGTATGGCATAACAGCCCGAGGAAGCAAGGTGATGCCA 

35 TATCAGTATGTTCTGACCTATCGTGACGAACATGGTAAAGAACATCAAAAGCAGTACA 

ATGACAAAAAACCAATGATCACTCAGGGACGTTTTGATACCACT 

CAAGACAAAAGCCCTTGACTCATCAGGCATTGTCCGCGAAGAAGTC 

CGTAAATTTGATGTGACAGAAGGTAAAGATGGTATCACAGTTAGTGACAATAAGGTC 

ATCCAGATGGTTCTTAC^CCATTTCAAAA 
40 AGATAGAGCTGGTAATGTGTCTTTTGCTACCTTGCGTGA 

GTCAATTTTGGATTAGACTTACCGGTCCCTGAAGACAAACAAATAGT^ 

ATGCAGATGGTAAACCGATTGAAAACCTAGAGTATTATAATAACTCAGGT^ 

CGGCAAATACACGGTCGAATTGTTGACCTAT^ 

TCCTTTACCTTGTCAGCTGATAACAACTTCCAACAAGTTACCTTTAAGATAACGATC 
45 AAATAACTGCCCACTTTGATCATCTTTTCCCAGAAGGCAGTCGC 

GCTAATCCCGCTTGAACAGTCCTTGTATGTGCCTAAAGCTTATGGCAAAAC^ 

GAAGTTGTTGTCAGCCTGCCTAAAGGCTACCGTATCGAAGGCAACACAAAGGTGAATACCCTACC^ 

AAGTGCACGAACTATCATTACGCCTTGTCAAAGTAGG^ 

TATGTCAAAAAATAATTCACAGGCTTTGUVCAGCCTCTGCCACACCAACCAAGTCAACGA 
50 GCAAAAGCC CTACCATC^CGGGTGAAAAAATGGGTCTCAAGTTGCGCATO 
GACTTACTTGCGTCTTTAGCCGAAAAAAATCAACCAAAGATTGA 

Preferred GAS 057 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 61; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 61, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 057 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 61. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 61. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 61. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 61 is removed. In another example, the underlined amino acid sequence at the
C-terminus of SEQ ID NO: 61 is removed. Other fragments omit one or more domains of the protein
(eg. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an
extracellular domain).
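
The three sequence criteria used throughout this section (percent identity to a reference sequence, a fragment of at least n consecutive amino acids, and removal of residues from either terminus) can be illustrated with a minimal Python sketch. The text does not specify an alignment method, so the sketch assumes the two sequences have already been aligned by some external tool and simply counts identical columns; the function names and the toy reference sequence are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment ('-' marks a gap).

    Assumption: identity is counted over all alignment columns that are not
    gaps in both sequences. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def trim_termini(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term >= len(reference):
        raise ValueError("trim lengths must be non-negative and leave residues")
    return reference[n_term:len(reference) - c_term]


# Hypothetical toy reference (NOT one of the SEQ ID NOs in this document).
TOY_REFERENCE = "MKTAYIAKQRQISFVKSHFSRQ"
```

With the toy reference, `is_fragment("YIAKQRQ", TOY_REFERENCE)` satisfies the n = 7 minimum, while a six-residue run does not. A real analysis would first align the sequences with an established alignment program.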

The immunogenicity of other known GAS antigens may be improved by combination with two or
more GAS antigens of the first antigen group. Such other known GAS antigens include a second
antigen group consisting of (1) one or more variants of the M surface protein or fragments thereof,
(2) fibronectin-binding protein, (3) streptococcal heme-associated protein, or (4) SagA. These
antigens are referred to herein as the "second antigen group".

The invention thus includes an immunogenic composition comprising a combination of GAS 
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and 
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination 
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. 
Still more preferably, the combination consists of three, four or five GAS antigens from the first 
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and 
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein. 

Each of the GAS antigens of the second antigen group is described in more detail below.

(1) M surface protein

Over 100 different type variants of the M protein have been identified. Epitopes having increased
bactericidal activity and having decreased likelihood of cross-reacting with human tissues have been
identified in the amino terminal region and combined into fusion proteins containing approximately
six, seven, or eight M protein fragments linked in tandem. See Refs. 4, 5, 6, WO 02/094851 and WO
94/06465. (Each of the M protein variants, fragments and fusion proteins described in these
references is specifically incorporated herein by reference.)

Accordingly, the compositions of the invention may further comprise a GAS M surface protein or a
fragment or derivative thereof. One or more GAS M surface protein fragments may be combined 
together in a fusion protein. Alternatively, one or more GAS M surface protein fragments are 
combined with a GAS antigen or fragment thereof of the first antigen group. One example of a GAS 
M protein is set forth below. 

SEQ ID NO: 63

MAKNNTNRHYSLRKLKTGTAS VAVALTVLGAGFANQTBVKANGDGN PRBVI EDLAANN PAI QN I RLRY EN 
KDLKARLENAMEVAGRDFKRAEELEKAKQA^ 
KBALEIAX DQAS RDYHRATALBKBLEBKKKALBLAI DQASQDYNRAHVU2KBLBTI TRBQBI NRNLLGH A
KWLDOWSEKEQLTIBKAKI^BBKQISDASRQSUUU)^^ 

C)ASRQGLRRDL0ASREAKKQVBKDLAHLTABLOKVKEB KQI S DAS RQGLRRDLDASREAKKQVE KALE RA 
NSKLAALBKLNKBLBSSKXLTBKBKABWAKI^^ 
5 KAVPGKGQAPQACTKPNQNKAPMKBTKRQLPSTGBT^ 

Preferred GAS M proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 63; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 63, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS M proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 63. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 63. Preferably, the fragment is one of those described in the
references above. Preferably, the fragment is constructed in a fusion protein with one or more
additional M protein fragments. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 63. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(2) Fibronectin-binding protein

GAS fibronectin-binding protein ('SfbI') is a multifunctional bacterial protein thought to mediate
attachment of the bacteria to host cells, to facilitate bacterial internalization into cells, and to bind
to the Fc fragment of human IgG, thus interfering with Fc-receptor mediated phagocytosis and
antibody-dependent cell cytotoxicity. Immunization of mice with SfbI and an 'H12 fragment' (encoded
by positions 1240 - 1854 of the SfbI gene) is discussed in Refs. 7, 8 and 9. One example of an amino
acid sequence for GAS SfbI is shown below.

SEQ ID NO: 64 

MSFDGFFXHHLTNELKENLLYGRIQKVNQ^ 

VPNTFTM I MRKYLQGAV I EQLEQI DNDR 1 1 E I KVSNKNE I GDA I QATL I I B IMGKH SN I I LVDRAENKI I 

30 ESIKHVGFSQNSYRTILPGSTYIEPPKTAAVNPPTITDVPLFEILQTQELTVK^ 

AELLTTDKLKRPREPFARPTQANLTTASPAPVLFSDSHATPETLSDMLDHFYQDKABRDRINQQASDIilH 
RVQTELDKNRNKLS KQEAELLATEN AELPRQKGBLLTTYLSLVPNNQD SV1 LDNYYTGE KI EI ALDKALT 
PNQNAQR YFKXYQKLKEAVKHL SGL I ADTKQS I TYFES VDYNLSQAS I DDI ED I RE EL YQAGFL KSRQRD 
KRHKRKKPEQ YLASDGTT I LMVGRNNLQNEELTFKMAKKGELWFHAKD I PGSHVI I KDNLDPS DEVKTDA 

35 AELAAY YSKARLSN LVQ VDMI EAKKLHKPSGAKPGFVTYTGQKTLRVTPDQAKI LSMKLS 

Preferred SfbI proteins for use with the invention comprise an amino acid sequence: (a) having 50%
or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 64; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 64, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These SfbI proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 64. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 64. Preferably, the fragment is one of those described in the references
above. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15,
20, 25 or more) from the C-terminus and/or one or more amino acids (eg. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
15, 20, 25 or more) from the N-terminus of SEQ ID NO: 64. Other fragments omit one or more
domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain, of a
transmembrane domain, or of an extracellular domain).

(3) Streptococcal heme-associated protein

The GAS streptococcal heme-associated protein ('Shp') has been identified as a GAS cell surface
protein. It is thought to be cotranscribed with genes encoding homologues of an ABC transporter
involved in iron uptake in gram-negative bacteria. The Shp protein is further described in Ref. 10.
One example of a Shp protein is shown below:

SEQ ID NO: 65

MTKWIKQLLQVIWFMISLSTMTNLV^ 

VYSDAMLBVSDAGKI VLTFRMS LADY SGNYQFW I QPGGTGSFOAVDYN I TQKGTDTNGTTLD I AI SLPTV 
NS 1 1 RGSMFVE PMGREWF YLSASEL IQKYSGNMIAQLVTETDNSQNQBVKDSQKPVDTKLGBSQDES HT 
GAMITQNKPKANSSITOKSLSDKKILPSKMGLTTSLELKKED^ 
WKKRKKNDKTM

Preferred Shp proteins for use with the invention comprise an amino acid sequence: (a) having 50% or
more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, 99.5% or more) to SEQ ID NO: 65; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 65, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These Shp proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 65. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 65. Other preferred fragments lack one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 65. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(4) SagA

Streptolysin S (SLS), also known as 'SagA', is thought to be produced by almost all GAS colonies.
This cytolytic toxin is responsible for the beta-hemolysis surrounding colonies of GAS grown on
blood agar and is thought to be associated with virulence. While the full SagA peptide has not been
shown to be immunogenic, a fragment of amino acids 10-30 (SagA 10-30) has been used to
produce neutralizing antibodies. See Ref. 11. The amino acid sequence of SagA 10-30 is shown
below:

SEQ ID NO: 66 FSIATGSGNSQGGSGSYTPGKC

Preferred SagA 10-30 proteins for use with the invention comprise an amino acid sequence: (a)
having 50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 66; and/or (b) which is a fragment of at
least n consecutive amino acids of SEQ ID NO: 66, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18,
or 20). These SagA 10-30 proteins include variants (eg. allelic variants, homologs, orthologs,
paralogs, mutants, etc.) of SEQ ID NO: 66.

There is an upper limit to the number of GAS antigens which will be in the compositions of the
invention. Preferably, the number of GAS antigens in a composition of the invention is less than 20,
less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than
12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4,
or less than 3. Still more preferably, the number of GAS antigens in a composition of the invention is
less than 6, less than 5, or less than 4. Still more preferably, the number of GAS antigens in a
composition of the invention is 3.

The GAS antigens used in the invention are preferably isolated, i.e., separate and discrete, from the 
whole organism with which the molecule is found in nature or, when the polynucleotide or 
polypeptide is not found in nature, is sufficiently free of other biological macromolecules so that the 
polynucleotide or polypeptide can be used for its intended purpose. 

Fusion proteins

The GAS antigens used in the invention may be present in the composition as individual separate
polypeptides, but it is preferred that at least two (i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19 or 20) of the antigens are expressed as a single polypeptide chain (a "hybrid" polypeptide).
Hybrid polypeptides offer two principal advantages: first, a polypeptide that may be unstable or
poorly expressed on its own can be assisted by adding a suitable hybrid partner that overcomes the
problem; second, commercial manufacture is simplified as only one expression and purification need
be employed in order to produce two polypeptides which are both antigenically useful.

The hybrid polypeptide may comprise two or more polypeptide sequences from the first antigen 
group. Accordingly, the invention includes a composition comprising a first amino acid sequence and 
a second amino acid sequence, wherein said first and second amino acid sequences are selected from a
GAS antigen or a fragment thereof of the first antigen group. Preferably, the first and second amino 
acid sequences in the hybrid polypeptide comprise different epitopes. 

The hybrid polypeptide may comprise one or more polypeptide sequences from the first antigen group
and one or more polypeptide sequences from the second antigen group. Accordingly, the invention
includes a composition comprising a first amino acid sequence and a second amino acid sequence,
said first amino acid sequence selected from a GAS antigen or a fragment thereof from the first
antigen group and said second amino acid sequence selected from a GAS antigen or a fragment
thereof from the second antigen group. Preferably, the first and second amino acid sequences in the
hybrid polypeptide comprise different epitopes.

Hybrids consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten
GAS antigens are preferred. In particular, hybrids consisting of amino acid sequences from two,
three, four, or five GAS antigens are preferred.

Different hybrid polypeptides may be mixed together in a single formulation. Within such
combinations, a GAS antigen may be present in more than one hybrid polypeptide and/or as a
non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a
non-hybrid, but not as both.

Hybrid polypeptides can be represented by the formula NH2-A-{-X-L-}n-B-COOH, wherein: X is an
amino acid sequence of a GAS antigen or a fragment thereof from the first antigen group or the
second antigen group; L is an optional linker amino acid sequence; A is an optional N-terminal amino
acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14 or 15.

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in
the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X-
moiety located at the N-terminus of the hybrid protein i.e. the leader peptide of X1 will be retained,
but the leader peptides of X2 ... Xn will be omitted. This is equivalent to deleting all leader peptides
and using the leader peptide of X1 as moiety -A-.

For each n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For
instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH,
NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will typically
be short (e.g. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1).
Examples comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e.
comprising Glyn where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisn where
n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker amino acid sequences will be apparent to
those skilled in the art. A useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a
BamHI restriction site, thus aiding cloning and manipulation, and the (Gly)4 tetrapeptide being a
typical poly-glycine linker.

-A- is an optional N-terminal amino acid sequence. This will typically be short (eg. 40 or fewer
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein
trafficking, or short peptide sequences which facilitate cloning or purification (e.g. histidine tags i.e.
Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be
apparent to those skilled in the art. If X1 lacks its own N-terminus methionine, -A- is preferably an
oligopeptide (e.g. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides an N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (eg. 40 or fewer
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein
trafficking, short peptide sequences which facilitate cloning or purification (eg. comprising histidine
tags i.e. Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability.
Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art.

Most preferably, n is 2 or 3. 
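
As an illustration only, the formula NH2-A-{-X-L-}n-B-COOH can be read as a simple concatenation rule. The Python sketch below assembles a hybrid sequence from n antigen moieties, one optional linker per moiety, and optional N- and C-terminal sequences. The function name `build_hybrid` and the short placeholder antigen sequences are hypothetical; only the GSGGGG linker and the histidine tag are taken from the text, and the size guidance given above for -L-, -A- and -B- is described as typical rather than mandatory, so it is not enforced here.

```python
from typing import Optional, Sequence


def build_hybrid(x_moieties: Sequence[str],
                 linkers: Optional[Sequence[str]] = None,
                 a: str = "",
                 b: str = "") -> str:
    """Concatenate A, then each X followed by its (possibly empty) L, then B.

    x_moieties: the n antigen sequences X1..Xn (n must be 2 to 15).
    linkers:    one linker per moiety; an empty string means "linker absent".
    a, b:       optional N-terminal and C-terminal sequences.
    """
    n = len(x_moieties)
    if not 2 <= n <= 15:
        raise ValueError("n must be between 2 and 15")
    if linkers is None:
        linkers = [""] * n  # no linkers at all
    if len(linkers) != n:
        raise ValueError("provide exactly one linker slot per X moiety")
    body = "".join(x + l for x, l in zip(x_moieties, linkers))
    return a + body + b


# Placeholder antigen fragments, invented for this sketch.
X1 = "MAAAK"
X2 = "TTTPK"

# NH2-X1-L1-X2-B-COOH with L1 = GSGGGG and B = His6.
example = build_hybrid([X1, X2], linkers=["GSGGGG", ""], b="HHHHHH")
```

Leaving a linker slot empty reproduces the forms in which -L- is absent, for example `build_hybrid([X1, X2])` for NH2-X1-X2-COOH.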

The invention also provides nucleic acid encoding hybrid polypeptides of the invention. Furthermore,
the invention provides nucleic acid which can hybridise to this nucleic acid, preferably under "high
stringency" conditions (eg. 65°C in a 0.1xSSC, 0.3% SDS solution).

Polypeptides of the invention can be prepared by various means (eg. recombinant expression,
purification from cell culture, chemical synthesis, etc.) and in various forms (eg. native, fusions,
non-glycosylated, lipidated, etc.). They are preferably prepared in substantially pure form (i.e.
substantially free from other GAS or host cell proteins).

Nucleic acid according to the invention can be prepared in many ways (eg. by chemical synthesis,
from genomic or cDNA libraries, from the organism itself, etc.) and can take various forms (eg.
single stranded, double stranded, vectors, probes, etc.). It is preferably prepared in substantially
pure form (i.e. substantially free from other GAS or host cell nucleic acids).

The term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing
modified backbones (eg. phosphorothioates, etc.), and also peptide nucleic acids (PNA), etc. The
invention includes nucleic acid comprising sequences complementary to those described above (eg.
for antisense or probing purposes).

The invention also provides a process for producing a polypeptide of the invention, comprising the 
step of culturing a host cell transformed with nucleic acid of the invention under conditions which
induce polypeptide expression. 

The invention provides a process for producing a polypeptide of the invention, comprising the step of 
synthesising at least part of the polypeptide by chemical means. 

The invention provides a process for producing nucleic acid of the invention, comprising the step of 
amplifying nucleic acid using a primer-based amplification method (eg. PCR).

The invention provides a process for producing nucleic acid of the invention, comprising the step of 
synthesising at least part of the nucleic acid by chemical means. 

Strains 

Preferred polypeptides of the invention comprise an amino acid sequence found in an M1, M3 or M18
strain of GAS. The genomic sequence of an M1 GAS strain is reported at Ref. 12. The genomic
sequence of an M3 GAS strain is reported at Ref. 13. The genomic sequence of an M18 GAS strain is
reported at Ref. 14.

Where hybrid polypeptides are used, the individual antigens within the hybrid (i.e. individual -X-
moieties) may be from one or more strains. Where n=2, for instance, X2 may be from the same strain
as X1 or from a different strain. Where n=3, the strains might be (i) X1=X2=X3 (ii) X1=X2≠X3 (iii)
X1≠X2=X3 (iv) X1≠X2≠X3 or (v) X1=X3≠X2, etc.
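
The possible strain relationships among the -X- moieties of a hybrid correspond to the ways of grouping n items into sets that share a strain: two ways for n=2 and five ways for n=3. The following Python sketch enumerates these groupings; it is an illustrative aid under that reading of the text, and the function names `strain_patterns` and `describe` are hypothetical.

```python
from typing import Iterator, List, Tuple


def strain_patterns(n: int) -> Iterator[Tuple[int, ...]]:
    """Yield every way n moieties can share or differ in strain.

    Each pattern is a tuple p where p[i] is an arbitrary strain label for
    moiety X(i+1). Labels are assigned in order of first appearance, so each
    distinct grouping is produced exactly once.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    def extend(prefix: List[int]) -> Iterator[Tuple[int, ...]]:
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # The next moiety may reuse any existing label or introduce one new label.
        for label in range(max(prefix) + 2):
            yield from extend(prefix + [label])

    yield from extend([0])


def describe(pattern: Tuple[int, ...]) -> str:
    """Render a pattern as groups of moieties from the same strain."""
    groups = {}
    for i, label in enumerate(pattern, start=1):
        groups.setdefault(label, []).append(f"X{i}")
    return " / ".join("=".join(members) for members in groups.values())


for p in strain_patterns(3):
    print(describe(p))
```

In the printed output, "=" joins moieties from the same strain and " / " separates groups from different strains.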

Purification and Recombinant Expression 

The GAS antigens of the invention may be isolated from Streptococcus pyogenes, or they may be
recombinantly produced, for instance, in a heterologous host. Preferably, the GAS antigens are
prepared using a heterologous host. The heterologous host may be prokaryotic (eg. a bacterium) or
eukaryotic. It is preferably E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae,
Salmonella typhi, Salmonella typhimurium, Neisseria lactamica, Neisseria cinerea, Mycobacteria
(eg. M.tuberculosis), yeasts, etc.

Recombinant production of polypeptides is facilitated by adding a tag protein to the GAS antigen to
be expressed as a fusion protein comprising the tag protein and the GAS antigen. Such tag proteins
can facilitate purification, detection and stability of the expressed protein. Tag proteins suitable for
use in the invention include a polyarginine tag (Arg-tag), polyhistidine tag (His-tag), FLAG-tag,
Strep-tag, c-myc-tag, S-tag, calmodulin-binding peptide, cellulose-binding domain, SBP-tag,
chitin-binding domain, glutathione S-transferase-tag (GST), maltose-binding protein, transcription
termination anti-termination factor (NusA), E. coli thioredoxin (TrxA) and protein disulfide
isomerase I (DsbA). Preferred tag proteins include His-tag and GST. A full discussion on the use of
tag proteins can be found at Ref. 15.

After purification, the tag proteins may optionally be removed from the expressed fusion protein, i.e.,
by specifically tailored enzymatic treatments known in the art. Commonly used proteases include
enterokinase, tobacco etch virus (TEV), thrombin, and factor Xa.

Immunogenic compositions and medicaments 

Compositions of the invention are preferably immunogenic compositions, and are more preferably 
vaccine compositions. The pH of the composition is preferably between 6 and 8, preferably about 7. 
The pH may be maintained by the use of a buffer. The composition may be sterile and/or
pyrogen-free. The composition may be isotonic with respect to humans. 

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or
therapeutic (i.e. to treat infection), but will typically be prophylactic. Accordingly, the invention
includes a method for the therapeutic or prophylactic treatment of a Streptococcus pyogenes infection
in an animal susceptible to streptococcal infection comprising administering to said animal a
therapeutic or prophylactic amount of the immunogenic compositions of the invention. Preferably,
the immunogenic composition comprises a combination of GAS antigens, said combination consisting
of two to thirty-one GAS antigens of the first antigen group. Preferably, the combination of GAS
antigens consists of three, four, five, six, seven, eight, nine, or ten GAS antigens selected from the
first antigen group. Preferably, the combination of GAS antigens consists of three, four, or five GAS
antigens selected from the first antigen group. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117.

Alternatively, the invention includes an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein.

The invention also provides a composition of the invention for use as a medicament. The medicament
is preferably able to raise an immune response in a mammal (i.e. it is an immunogenic composition)
and is more preferably a vaccine.

The invention also provides the use of the compositions of the invention in the manufacture of a 
medicament for raising an immune response in a mammal. The medicament is preferably a vaccine. 

The invention also provides for a kit comprising a first component comprising a combination of GAS
antigens. In one embodiment, the combination of GAS antigens consists of a mixture of two to
thirty-one GAS antigens selected from the first antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Preferably,
the combination consists of three, four, or five GAS antigens from the first antigen group. Preferably,
the combination includes either or both of GAS 117 and GAS 040.

In another embodiment, the kit comprises a first component comprising a combination of GAS
antigens consisting of a mixture of two to thirty-one GAS antigens of the first antigen group and one,
two, three, or four GAS antigens of the second antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Still more
preferably, the combination consists of three, four or five GAS antigens from the first antigen group.
Preferably, the combination of GAS antigens includes either or both of GAS 40 and GAS 117.
Preferably, the combination of GAS antigens includes one or more variants of the M surface protein.

The invention also provides a delivery device pre-filled with the immunogenic compositions of the 
invention. 

The invention also provides a method for raising an immune response in a mammal comprising the 
step of administering an effective amount of a composition of the invention. The immune response is
preferably protective and preferably involves antibodies and/or cell-mediated immunity. The method 
may raise a booster response. 

The mammal is preferably a human. Where the vaccine is for prophylactic use, the human is
preferably a child (eg. a toddler or infant) or a teenager; where the vaccine is for therapeutic use, the
human is preferably a teenager or an adult. A vaccine intended for children may also be administered
to adults eg. to assess safety, dosage, immunogenicity, etc.

These uses and methods are preferably for the prevention and/or treatment of a disease caused by
Streptococcus pyogenes (eg. pharyngitis (such as streptococcal sore throat), scarlet fever, impetigo,
erysipelas, cellulitis, septicemia, toxic shock syndrome, necrotizing fasciitis (flesh eating disease) and
sequelae (such as rheumatic fever and acute glomerulonephritis)). The compositions may also be
effective against other streptococcal bacteria.

One way of checking efficacy of therapeutic treatment involves monitoring GAS infection after 
administration of the composition of the invention. One way of checking efficacy of prophylactic 
treatment involves monitoring immune responses against the GAS antigens in the compositions of the 
invention after administration of the composition. 

Compositions of the invention will generally be administered directly to a patient. Direct delivery may
be accomplished by parenteral injection (eg. subcutaneously, intraperitoneally, intravenously,
intramuscularly, or to the interstitial space of a tissue), or by rectal, oral (eg. tablet, spray), vaginal,
topical, transdermal {eg. see ref. 16} or transcutaneous {eg. see refs. 17 & 18}, intranasal {eg. see
ref. 19}, ocular, aural, pulmonary or other mucosal administration.

The invention may be used to elicit systemic and/or mucosal immunity.

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be 
used in a primary immunisation schedule and/or in a booster immunisation schedule. In a multiple 
dose schedule the various doses may be given by the same or different routes eg. a parenteral prime 
and mucosal boost, a mucosal prime and parenteral boost, etc. 

The compositions of the invention may be prepared in various forms. For example, the compositions
may be prepared as injectables, either as liquid solutions or suspensions. Solid forms suitable for
solution in, or suspension in, liquid vehicles prior to injection can also be prepared (eg. a lyophilised
composition). The composition may be prepared for topical administration eg. as an ointment, cream
or powder. The composition may be prepared for oral administration eg. as a tablet or capsule, as a
spray, or as a syrup (optionally flavoured). The composition may be prepared for pulmonary
administration eg. as an inhaler, using a fine powder or a spray. The composition may be prepared as
a suppository or pessary. The composition may be prepared for nasal, aural or ocular administration
eg. as drops. The composition may be in kit form, designed such that a combined composition is
reconstituted just prior to administration to a patient. Such kits may comprise one or more antigens in
liquid form and one or more lyophilised antigens.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of
antigen(s), as well as any other components, as needed. By 'immunologically effective amount', it is
meant that the administration of that amount to an individual, either in a single dose or as part of a
series, is effective for treatment or prevention. This amount varies depending upon the health and
physical condition of the individual to be treated, age, the taxonomic group of individual to be treated
(eg. non-human primate, primate, etc), the capacity of the individual's immune system to synthesise
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's
assessment of the medical situation, and other relevant factors. It is expected that the amount will fall
in a relatively broad range that can be determined through routine trials.

Further components of the composition 

The composition of the invention will typically, in addition to the components mentioned above,
comprise one or more 'pharmaceutically acceptable carriers', which include any carrier that does not
itself induce the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolised macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of
ordinary skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc.
Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances,
and the like, may be present. A thorough discussion of pharmaceutically acceptable excipients is
available in reference 20.

Vaccines of the invention may be administered in conjunction with other immunoregulatory agents. In 
particular, compositions will usually include an adjuvant.

Preferred further adjuvants include, but are not limited to, one or more of the following set forth 
below: 

A. Mineral Containing Compositions

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts,
such as aluminium salts and calcium salts. The invention includes mineral salts such as hydroxides
(eg. oxyhydroxides), phosphates (eg. hydroxyphosphates, orthophosphates), sulphates, etc. {eg. see
chapters 8 & 9 of ref. 21}, or mixtures of different mineral compounds, with the compounds taking
any suitable form (eg. gel, crystalline, amorphous, etc), and with adsorption being preferred. The
mineral containing compositions may also be formulated as a particle of metal salt. See ref. 22.

B. Oil-Emulsions

Oil-emulsion compositions suitable for use as adjuvants in the invention include squalene-water 
emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into 
submicron particles using a microfluidizer). See ref. 23. 

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as
adjuvants in the invention.

C. Saponin Formulations 

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterologous
group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots
and even flowers of a wide range of plant species. Saponins from the bark of the Quillaia saponaria
Molina tree have been widely studied as adjuvants. Saponin can also be commercially obtained from
Smilax ornata (sarsaparilla), Gypsophila paniculata (bride's veil), and Saponaria officinalis (soap
root). Saponin adjuvant formulations include purified formulations, such as QS21, as well as lipid
formulations, such as ISCOMs.

Saponin compositions have been purified using High Performance Thin Layer Chromatography
(HP-TLC) and Reversed Phase High Performance Liquid Chromatography (RP-HPLC). Specific purified
fractions using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A,
QH-B and QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in U.S.
Patent No. 5,057,540. Saponin formulations may also comprise a sterol, such as cholesterol (see WO
96/33739).

Combinations of saponins and cholesterols can be used to form unique particles called
Immunostimulating Complexes (ISCOMs). ISCOMs typically also include a phospholipid such as
phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be used in ISCOMs.
Preferably, the ISCOM includes one or more of Quil A, QHA and QHC. ISCOMs are further
described in EP 0 109 942, WO 96/11711 and WO 96/33739. Optionally, the ISCOMs may be
devoid of additional detergent. See ref. 24.

A review of the development of saponin based adjuvants can be found at ref. 25. 

C. Virosomes and Virus Like Particles (VLPs)

Virosomes and Virus Like Particles (VLPs) can also be used as adjuvants in the invention. These
structures generally contain one or more proteins from a virus optionally combined or formulated with
a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain any
of the native viral genome. The viral proteins may be recombinantly produced or isolated from whole
viruses. These viral proteins suitable for use in virosomes or VLPs include proteins derived from
influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), Hepatitis E
virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, Norwalk
virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA-phage,
fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed further in
WO 03/024480, WO 03/024481, and Refs. 26, 27, 28 and 29. Virosomes are discussed further in, for
example, Ref. 30.

D. Bacterial or Microbial Derivatives 

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as: 

(1) Non-toxic derivatives of enterobacterial lipopolysaccharide (LPS)

Such derivatives include Monophosphoryl lipid A (MPL) and 3-O-deacylated MPL (3dMPL).
3dMPL is a mixture of 3 De-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated chains. A
preferred "small particle" form of 3 De-O-acylated monophosphoryl lipid A is disclosed in EP 0 689
454. Such "small particles" of 3dMPL are small enough to be sterile filtered through a 0.22 micron
membrane (see EP 0 689 454). Other non-toxic LPS derivatives include monophosphoryl lipid A
mimics, such as aminoalkyl glucosaminide phosphate derivatives e.g. RC-529. See Ref. 31.

(2) Lipid A Derivatives

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is
described for example in Ref. 32 and 33.

(3) Immunostimulatory oligonucleotides

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide
sequences containing a CpG motif (a sequence containing an unmethylated cytosine followed by
guanosine and linked by a phosphate bond). Bacterial double stranded RNA or oligonucleotides
containing palindromic or poly(dG) sequences have also been shown to be immunostimulatory.

The CpG's can include nucleotide modifications/analogs such as phosphorothioate modifications and
can be double-stranded or single-stranded. Optionally, the guanosine may be replaced with an analog
such as 2'-deoxy-7-deazaguanosine. See ref. 34, WO 02/26757 and WO 99/62923 for examples of
possible analog substitutions. The adjuvant effect of CpG oligonucleotides is further discussed in
Refs. 35, 36, WO 98/40100, U.S. Patent No. 6,207,646, U.S. Patent No. 6,239,116, and U.S. Patent
No. 6,429,199.

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT. See ref. 37.
The CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it
may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs
are discussed in refs. 38, 39 and WO 01/95935. Preferably, the CpG is a CpG-A ODN.

Preferably, the CpG oligonucleotide is constructed so that the 5' end is accessible for receptor
recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3' ends to form
"immunomers". See, for example, refs. 40, 41, 42 and WO 03/035836.

(4) ADP-ribosylating toxins and detoxified derivatives thereof. 

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the
invention. Preferably, the protein is derived from E. coli (i.e., E. coli heat labile enterotoxin "LT"),
cholera ("CT"), or pertussis ("PT"). The use of detoxified ADP-ribosylating toxins as mucosal
adjuvants is described in WO 95/17211 and as parenteral adjuvants in WO 98/42375. Preferably, the
adjuvant is a detoxified LT mutant such as LT-K63.

E. Human Immunomodulators 

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as
interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. interferon-γ),
macrophage colony stimulating factor, and tumor necrosis factor.

F. Bioadhesives and Mucoadhesives 

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable
bioadhesives include esterified hyaluronic acid microspheres (Ref. 43) or mucoadhesives such as
cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone,
polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as
adjuvants in the invention. E.g., ref. 44.

G. Microparticles 

Microparticles may also be used as adjuvants in the invention. Microparticles (ie. a particle of
~100nm to ~150μm in diameter, more preferably ~200nm to ~30μm in diameter, and most preferably
~500nm to ~10μm in diameter) formed from materials that are biodegradable and non-toxic (eg. a
poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a
polycaprolactone, etc.), with poly(lactide-co-glycolide), are preferred, optionally treated to have a
negatively-charged surface (eg. with SDS) or a positively-charged surface (eg. with a cationic
detergent, such as CTAB).

H. Liposomes

Examples of liposome formulations suitable for use as adjuvants are described in U.S. Patent No.
6,090,406, U.S. Patent No. 5,916,588, and EP 0 626 169.

I. Polyoxyethylene Ether and Polyoxyethylene Ester Formulations

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene
esters. Ref. 45. Such formulations further include polyoxyethylene sorbitan ester surfactants in
combination with an octoxynol (Ref. 46) as well as polyoxyethylene alkyl ethers or ester surfactants
in combination with at least one additional non-ionic surfactant such as an octoxynol (Ref. 47).

Preferred polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl
ether (laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether,
polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.

J. Polyphosphazene (PCPP) 

PCPP formulations are described, for example, in Ref. 48 and 49.

K. Muramyl peptides

Examples of muramyl peptides suitable for use as adjuvants in the invention include
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine
(nor-MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-
hydroxyphosphoryloxy)-ethylamine (MTP-PE).

L. Imidazoquinolone Compounds

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include
Imiquimod and its homologues, described further in Ref. 50 and 51.

The invention may also comprise combinations of aspects of one or more of the adjuvants identified
above. For example, the following adjuvant compositions may be used in the invention:

(1) a saponin and an oil-in-water emulsion (ref. 52);

(2) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) (see WO
94/00153);

(3) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) + a cholesterol;

(4) a saponin (eg. QS21) + 3dMPL + IL-12 (optionally + a sterol) (Ref. 53);

combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions (Ref. 54);

(5) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-block polymer L121,
and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger
particle size emulsion;

(6) Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% Squalene, 0.2%
Tween 80, and one or more bacterial cell wall components from the group consisting of
monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); and

(7) one or more mineral salts (such as an aluminum salt) + a non-toxic derivative of LPS
(such as 3dMPL).

Aluminium salts and MF59 are preferred adjuvants for parenteral immunisation. Mutant bacterial
toxins are preferred mucosal adjuvants.

The composition may include an antibiotic.

Further antigens

The compositions of the invention may further comprise one or more additional non-GAS antigens,
including additional bacterial, viral or parasitic antigens.

In one embodiment, the GAS antigen combinations of the invention are combined with one or more
additional, non-GAS antigens suitable for use in a paediatric vaccine. For example, the GAS antigen
combinations may be combined with one or more antigens derived from a bacterium or virus selected
from the group consisting of N. meningitidis (including serogroup A, B, C, W135 and/or Y),
Streptococcus pneumoniae, Bordetella pertussis, Moraxella catarrhalis, Tetanus, Diphtheria,
Respiratory Syncytial virus (RSV), polio, measles, mumps, rubella, and rotavirus.

In another embodiment, the GAS antigen combinations of the invention are combined with one or
more additional, non-GAS antigens suitable for use in a vaccine designed to protect elderly or
immunocompromised individuals. For example, the GAS antigen combinations may be combined
with an antigen derived from the group consisting of Enterococcus faecalis, Staphylococcus
aureus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Legionella pneumophila, Listeria
monocytogenes, influenza, and Parainfluenza virus (PIV).

Where a saccharide or carbohydrate antigen is used, it is preferably conjugated to a carrier protein in
order to enhance immunogenicity {e.g. refs. 55 to 64}. Preferred carrier proteins are bacterial toxins
or toxoids, such as diphtheria or tetanus toxoids. The CRM197 diphtheria toxoid is particularly
preferred {65}. Other carrier polypeptides include the N. meningitidis outer membrane protein {66},
synthetic peptides {67, 68}, heat shock proteins {69, 70}, pertussis proteins {71, 72}, protein D from
H.influenzae {73}, cytokines {74}, lymphokines, hormones, growth factors, toxin A or B from
C.difficile {75}, iron-uptake proteins {76}, etc. Where a mixture comprises capsular saccharides from
both serogroups A and C, it may be preferred that the ratio (w/w) of MenA saccharide:MenC
saccharide is greater than 1 (eg. 2:1, 3:1, 4:1, 5:1, 10:1 or higher). Different saccharides can be
conjugated to the same or different type of carrier protein. Any suitable conjugation reaction can be
used, with any suitable linker where necessary.

Toxic protein antigens may be detoxified where necessary e.g. detoxification of pertussis toxin by
chemical and/or genetic means.

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus
antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to
include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is
preferred also to include diphtheria and tetanus antigens.

Antigens in the composition will typically be present at a concentration of at least 1 μg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against
that antigen.

As an alternative to using protein antigens in the composition of the invention, nucleic acid encoding
the antigen may be used {eg. refs. 77 to 85}. Protein components of the compositions of the
invention may thus be replaced by nucleic acid (preferably DNA eg. in the form of a plasmid) that
encodes the protein.

Definitions 

The term "comprising" means "including" as well as "consisting" eg. a composition "comprising" X 
may consist exclusively of X or may include something additional eg. X + Y. 

The term "about" in relation to a numerical value x means, for example, x±10%.
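
As an illustration of this definition, the following minimal Python sketch tests whether a value falls within "about" x. The function name `within_about` and the inclusive treatment of the interval ends are assumptions made for illustration only; the 10% figure is the example tolerance given above.

```python
def within_about(value: float, x: float, tolerance: float = 0.10) -> bool:
    """Return True if value lies within x ± tolerance*|x| (ends included).

    The 10% default mirrors the example in the definition; the inclusive
    bounds are an assumption of this sketch.
    """
    margin = abs(x) * tolerance
    return x - margin <= value <= x + margin


print(within_about(54, 50))  # 54 lies inside the 45..55 window
print(within_about(56, 50))  # 56 lies outside the 45..55 window
```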

References to a percentage sequence identity between two amino acid sequences means that, when
aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment
and the percent homology or sequence identity can be determined using software programs known in
the art, for example those described in section 7.7.18 of reference 86. A preferred alignment is
determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap
open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman
homology search algorithm is disclosed in reference 87.
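
The scoring scheme described above can be sketched in Python as follows. This is a minimal illustration, not the software of reference 86: it computes only the best local alignment score using the standard affine-gap (Gotoh) recurrences with a gap open penalty of 12 and a gap extension penalty of 2. Two assumptions are made and flagged in the code: the opening penalty is taken to cover the first gapped residue, and `simple_score` is a hypothetical match/mismatch stand-in for a BLOSUM62 lookup, which a real implementation would pass in its place.

```python
def simple_score(x: str, y: str, match: int = 5, mismatch: int = -4) -> int:
    """Hypothetical stand-in for a BLOSUM62 lookup (assumption of this sketch)."""
    return match if x == y else mismatch


def smith_waterman_affine(a, b, score=simple_score, gap_open=12, gap_extend=2):
    """Best local alignment score with affine gaps.

    Assumed convention: a gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]        # best score ending at (i, j)
    e = [[neg_inf] * cols for _ in range(rows)]  # alignments ending with a gap in a
    f = [[neg_inf] * cols for _ in range(rows)]  # alignments ending with a gap in b
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            e[i][j] = max(e[i][j - 1] - gap_extend, h[i][j - 1] - gap_open)
            f[i][j] = max(f[i - 1][j] - gap_extend, h[i - 1][j] - gap_open)
            diagonal = h[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            h[i][j] = max(0, diagonal, e[i][j], f[i][j])  # 0 restarts the local alignment
            best = max(best, h[i][j])
    return best


# 12 matches minus one two-residue gap: 12 * 5 - (12 + 2) = 46
print(smith_waterman_affine("AAAAAAGGGGGG", "AAAAAATTGGGGGG"))
```

A percentage identity would then be read off the traceback of the best-scoring alignment; the traceback is omitted here to keep the sketch short.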

The following example demonstrates one way of preparing recombinant GAS antigens of the 
invention and testing their efficacy in a murine model. 

EXAMPLE 1: Preparation of recombinant GAS antigens of the invention and Demonstration of Efficacy in Murine Model.

Recombinant GAS proteins corresponding to two or more of the GAS antigens of the first antigen
group are expressed as follows.

1. Cloning of GAS antigens for expression in E. coli

The selected GAS antigens were cloned in such a way to obtain two different kinds of
recombinant proteins: (1) proteins having a hexa-histidine tag at the carboxy-terminus (Gas-His)
and (2) proteins having the hexa-histidine tag at the carboxy-terminus and GST at the
amino-terminus (Gst-Gas-His). Type (1) proteins were obtained by cloning in a pET21b+ vector
(available from Novagen). The type (2) proteins were obtained by cloning in a pGEX-NNH
vector. This cloning strategy allowed for the GAS genomic DNA to be used to amplify the
selected genes by PCR, to perform a single restriction enzyme digestion of the PCR products and
to clone them simultaneously into both vectors.

(a) Construction of pGEX-NNH expression vectors

Two pairs of complementary oligodeoxyribonucleotides are synthesised using the DNA synthesiser
ABI394 (Perkin Elmer) and reagents from Cruachem (Glasgow, Scotland). Equimolar amounts of the
oligo pairs (50 ng each oligo) are annealed in T4 DNA ligase buffer (New England Biolabs) for 10
min in a final volume of 50 μl and then left to cool slowly at room temperature. With the described
procedure the following DNA linkers are obtained:

gexNN linker

NdeI NheI XmaI EcoRI NcoI SalI XhoI SacI

GATCCCATATGGCTAGCCCGGGGAATTCGTCCATTC
    GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAG

NotI

CTGAGCGGCCGCATGAA
GACTCGCCGGCGTACTTTCGA

gexNNH linker

HindIII NotI XhoI Hexa-Histidine

TCGACAAGCTTGCGGCCGCACTCGAGCATCACCATCACCATCACTGAT
    GTTCGAACGCCGGCGTGAGCTCGTAGTGGTAGTGGTAGTGACTATCGA
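
As a consistency check on the gexNNH linker, the following Python sketch derives the lower strand from the upper strand, confirms that the named restriction sites are present, and translates the reading frame that starts at the XhoI site. It is a sketch under stated assumptions: the 4-nucleotide overhangs at each end are inferred from the SalI/HindIII cloning described for this linker, and the function names and the reduced codon table (only the codons that occur in this stretch) are introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Upper strand of the gexNNH linker, written 5'->3'.
GEXNNH_TOP = "TCGACAAGCTTGCGGCCGCACTCGAGCATCACCATCACCATCACTGAT"

SITES = {"HindIII": "AAGCTT", "NotI": "GCGGCCGC", "XhoI": "CTCGAG"}

# Reduced codon table: only the codons needed for this linker (assumption).
CODONS = {"CTC": "L", "GAG": "E", "CAT": "H", "CAC": "H", "TGA": "*"}


def lower_strand(top: str, overhang: int = 4, far_overhang: str = "TCGA") -> str:
    """Lower strand written 3'->5', as it is printed beneath the upper strand.

    Assumes a 4-nt single-stranded overhang at each end of the duplex.
    """
    return top[overhang:].translate(COMPLEMENT) + far_overhang


def translate_from(seq: str, start: int) -> str:
    """Translate complete codons of seq beginning at index start."""
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODONS[codon] for codon in codons)


for name, site in SITES.items():
    print(name, GEXNNH_TOP.find(site))

print(lower_strand(GEXNNH_TOP))
print(translate_from(GEXNNH_TOP, GEXNNH_TOP.find(SITES["XhoI"])))
```

Read in frame from the XhoI site, the linker encodes Leu-Glu, six histidines and a stop codon, which is the C-terminal hexa-histidine tag carried by both types of recombinant protein.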

The plasmid pGEX-KG [K. L. Guan and J. E. Dixon, Anal. Biochem. 192, 262 (1991)] is digested
with BamHI and HindIII and 100 ng is ligated overnight at 16 °C to the linker gexNN with a molar
ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase (New England Biolabs). After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NN plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.
The new plasmid pGEX-NN is digested with SalI and HindIII and ligated to the linker gexNNH. After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.

(b) Chromosomal DNA preparation 

GAS SF370 strain is grown in THY medium until OD600 is 0.6-0.8. Bacteria are then centrifuged,
suspended in TES buffer with lysozyme (10 mg/ml) and mutanolysin (10 U/μl) and incubated 1 hr at
37 °C. Following treatment of the bacterial suspension with RNAase, Proteinase K and 10%
Sarcosyl/EDTA, protein extraction with saturated phenol and phenol/chloroform is carried out. The
resulting supernatant is precipitated with Sodium Acetate/Ethanol and the extracted DNA is pelleted
by centrifugation, suspended in Tris buffer and kept at -20 °C.

(c) Oligonucleotide design

Synthetic oligonucleotide primers are designed on the basis of the coding sequence of each GAS
antigen using the sequence of Streptococcus pyogenes SF370 M1 strain. Any predicted signal peptide
is omitted, by deducing the 5' end amplification primer sequence immediately downstream from the
predicted leader sequence. For most GAS antigens, the 5' tails of the primers (see Table 1, below)
include only one restriction enzyme recognition site (NdeI, or NheI, or SpeI depending on the gene's
own restriction pattern); the 3' primer tails (see Table 1) include a XhoI or a NotI or a HindIII
restriction site.



5' tails                           3' tails

NdeI  5' GTGCGTCATATG 3'           XhoI     5' GCGTCTCGAG 3'
NheI  5' GTGCGTGCTAGC 3'           NotI     5' ACTCGCTAGCGGCCGC 3'
SpeI  5' GTGCGTACTAGT 3'           HindIII  5' GCGTAAGCTT 3'

Table 1. Oligonucleotide tails of the primers used to amplify genes encoding selected GAS
antigens.

As well as containing the restriction enzyme recognition sequences, the primers include nucleotides
which hybridize to the sequence to be amplified. The number of hybridizing nucleotides depends on
the melting temperature of the primers which can be determined as described (Breslauer et al., Proc.
Nat. Acad. Sci. 83, 3746-50 (1986)). The average melting temperature of the selected oligos is 50-55
°C for the hybridizing region alone and 65-75 °C for the whole oligos. Oligos can be purchased from
MWG-Biotech S.p.A. (Firenze, Italy).
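
The primer layout described above (restriction-site tail followed by a hybridizing region whose length is set by its melting temperature) can be sketched in Python as follows. The tails are those of Table 1. Everything else is an assumption made for illustration: the melting temperature is estimated with the simple Wallace rule (2 °C per A/T, 4 °C per G/C) rather than the nearest-neighbour method of Breslauer et al., the 52 °C target is merely a value inside the stated 50-55 °C range, and the input sequence is a made-up fragment that is taken to exclude the signal peptide and the stop codon.

```python
TAILS_5 = {"NdeI": "GTGCGTCATATG", "NheI": "GTGCGTGCTAGC", "SpeI": "GTGCGTACTAGT"}
TAILS_3 = {"XhoI": "GCGTCTCGAG", "NotI": "ACTCGCTAGCGGCCGC", "HindIII": "GCGTAAGCTT"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> int:
    """Wallace-rule estimate in degrees C (a stand-in for the Breslauer method)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridizing_region(template: str, target_tm: int = 52, min_len: int = 15) -> str:
    """Shortest 5' prefix of template whose estimated Tm reaches target_tm."""
    for length in range(min_len, len(template) + 1):
        if wallace_tm(template[:length]) >= target_tm:
            return template[:length]
    raise ValueError("template too short to reach the target melting temperature")


def design_primers(coding_seq: str, site5: str, site3: str, target_tm: int = 52):
    """Return (forward, reverse) primers as tail + hybridizing region."""
    forward = TAILS_5[site5] + hybridizing_region(coding_seq, target_tm)
    reverse = TAILS_3[site3] + hybridizing_region(reverse_complement(coding_seq), target_tm)
    return forward, reverse


# Hypothetical coding fragment, used only to exercise the functions.
toy_orf = "GCTAAAGGTCAAGTTCTGGCTGAAACCGGTCTGAAAGACGTT"
fwd, rev = design_primers(toy_orf, "NdeI", "XhoI")
print(fwd)
print(rev)
```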

(d) PCR amplification

The standard PCR protocol is as follows: 50 ng genomic DNA are used as template in the presence of
0.2 μM each primer, 200 μM each dNTP, 1.5 mM MgCl2, 1x PCR buffer minus Mg (Gibco-BRL),
and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 100 μl. Each
sample undergoes a double-step amplification: the first 5 cycles are performed using as the
hybridizing temperature that of the oligos excluding the restriction enzyme tail, followed by 25
cycles performed according to the hybridization temperature of the whole length primers. The
standard cycles are as follows:

one cycle:
denaturation: 94 °C, 2 min

5 cycles:
denaturation: 94 °C, 30 seconds
hybridization: $J °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

25 cycles:
denaturation: 94 °C, 30 seconds
hybridization: 70 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

72 °C, 7 min
4 °C

The elongation time is 1 min for GAS antigens encoded by ORFs shorter than 2000 bp, and 2 min and
40 seconds for ORFs longer than 2000 bp. The amplifications are performed using a Gene Amp PCR
system 9600 (Perkin Elmer).
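
The two-stage cycling scheme and the elongation-time rule can be written down as a small Python sketch. The function and field names are hypothetical. The first-stage hybridization temperature is left as a required parameter because it depends on the hybridizing region of each primer pair; an ORF of exactly 2000 bp, which the rule above does not cover, is treated here as a long ORF (an assumption), and the final 4 °C hold is not counted in the run time.

```python
def elongation_seconds(orf_length_bp: int) -> int:
    """1 min below 2000 bp, otherwise 2 min 40 s (exactly 2000 bp: assumed long)."""
    return 60 if orf_length_bp < 2000 else 160


def pcr_program(orf_length_bp: int, first_stage_anneal_c: float,
                second_stage_anneal_c: float = 70):
    """Build the program as a list of stages; each step is (name, temp in C, seconds)."""
    extension = elongation_seconds(orf_length_bp)

    def cycle(anneal_c):
        return [("denaturation", 94, 30),
                ("hybridization", anneal_c, 50),
                ("elongation", 72, extension)]

    return [
        {"repeats": 1, "steps": [("denaturation", 94, 120)]},
        {"repeats": 5, "steps": cycle(first_stage_anneal_c)},
        {"repeats": 25, "steps": cycle(second_stage_anneal_c)},
        {"repeats": 1, "steps": [("final 72 C step", 72, 420)]},
    ]


def total_seconds(program) -> int:
    """Programmed time, ignoring ramping and the final 4 C hold."""
    return sum(stage["repeats"] * sum(seconds for _, _, seconds in stage["steps"])
               for stage in program)


# 52 C is an illustrative first-stage value, not a figure from the protocol.
program = pcr_program(orf_length_bp=1500, first_stage_anneal_c=52)
print(total_seconds(program) / 60, "minutes")
```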

To check the amplification results, 4 μl of each PCR product is loaded onto 1-1.5% agarose gel and the
size of amplified fragments compared with DNA molecular weight standards (DNA markers III or IX,
Roche). The PCR products are loaded on agarose gel and after electrophoresis the right size bands are
excised from the gel. The DNA is purified from the agarose using the Gel Extraction Kit (Qiagen)
following the instruction of the manufacturer. The final elution volume of the DNA is 50 μl TE (10
mM Tris-HCl, 1 mM EDTA, pH 8). One μl of each purified DNA is loaded onto agarose gel to
evaluate the yield.

(e) Digestion of PCR fragments 

One to two μg of purified PCR products are double digested overnight at 37 °C with the appropriate
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 μl final
volume. The restriction enzymes and the digestion buffers are from New England Biolabs. After
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 μl TE, 1 μl is
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular
weight standards (DNA markers III or IX, Roche).

(f) Digestion of the cloning vectors (pET21b+ and pGEX-NNH)

10 μg of plasmid is double digested with 100 units of each restriction enzyme in 400 μl reaction
volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis
on a 1% agarose gel, the band corresponding to the digested vector is purified from the gel using the
Qiagen Qiaex II Gel Extraction Kit and the DNA is eluted with 50 μl TE. The DNA concentration
is evaluated by measuring OD260 of the sample.

(g) Cloning of the PCR products

Seventy five ng of the appropriately digested and purified vectors and the digested and purified
fragments corresponding to each selected GAS antigen are ligated in final volumes of 10-20 μl with a
molar ratio of 1:1 fragment/vector, using 400 units T4 DNA ligase (New England Biolabs) in the
presence of the buffer supplied by the manufacturer. The reactions are incubated overnight at 16 °C.
Transformation of E. coli BL21 (Novagen) and E. coli BL21-DE3 (Novagen) electrocompetent cells is
performed using pGEX-NNH ligations and pET21b+ ligations respectively. The transformation
procedure is as follows: 1-2 μl of the ligation reaction is mixed with 50 μl of ice cold competent cells,
then the cells are poured in a gene pulser 0.1 cm electrode cuvette (Biorad). After pulsing the cells in
a MicroPulser electroporator (Biorad) following the manufacturer instructions the cells are suspended
in 0.95 ml of SOC medium and incubated for 45 min at 37 °C under shaking. 100 and 900 μl of cell
suspensions are plated on separate plates of agar LB 100 μg/ml Ampicillin and the plates are
incubated overnight at 37 °C. The screening of the transformants is done by PCR: randomly chosen
transformants are picked and suspended in 30 μl of PCR reaction mix containing the PCR buffer, the
4 dNTPs, 1.5 mM MgCl2, Taq polymerase and appropriate forward and reverse oligonucleotide
primers that are able to hybridize upstream and downstream from the polylinker of pET21b+ or
pGEX-NNH vectors. After 30 cycles of PCR, 5 μl of the resulting products are run on agarose gel
electrophoresis in order to select for positive clones from which the expected PCR band is obtained.
PCR positive clones are chosen on the basis of the correct size of the PCR product, as evaluated by
comparison with appropriate molecular weight markers (DNA markers III or IX, Roche).
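
For the ligation step at the start of this procedure, the mass of fragment that gives a chosen molar ratio to a fixed mass of vector follows from the ratio of their lengths. The Python sketch below shows the arithmetic; the function name and the fragment and vector sizes in the usage lines are hypothetical values chosen only to illustrate the calculation, and both molecules are assumed to be double-stranded DNA of the same average mass per base pair.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_ratio: float = 1.0) -> float:
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Equal molar amounts of two dsDNA molecules have masses proportional to
    their lengths, so the insert mass scales with insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# 75 ng of a hypothetical 5400 bp vector with a hypothetical 1350 bp fragment at 1:1
print(insert_mass_ng(75, 5400, 1350))

# 100 ng of a hypothetical 5000 bp plasmid with a 50 bp linker at 3:1
print(insert_mass_ng(100, 5000, 50, insert_to_vector_ratio=3.0))
```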

2. Protein expression 

PCR positive colonies are inoculated in 3 ml LB 100 μg/ml Ampicillin and grown at 37 °C overnight.
70 μl of the overnight culture is inoculated in 2 ml LB/Amp and grown at 37 °C until the OD600 of the
pET clones reaches 0.4-0.8 or until the OD600 of the pGEX clones reaches 0.8-1.
Protein expression is then induced by adding 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) to
the mini-cultures. After 3 hours incubation at 37 °C the final OD600 is checked and the cultures are
cooled on ice. After centrifugation of 0.5 ml culture, the cell pellet is suspended in 50 μl of protein
Loading Sample Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v
Bromophenol Blue, 100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample
corresponding to 0.1 OD600 culture is analysed by SDS-PAGE and Coomassie Blue staining to verify
the presence of induced protein band.
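
The gel-loading volume in the last step can be worked out from the final optical density. The Python sketch below assumes that "0.1 OD600 culture" means the cells contained in 1 ml of culture at an OD600 of 0.1 (0.1 OD units); that reading, like the function name, is an assumption of this sketch. The pellet volume (0.5 ml) and the buffer volume (50 μl) are the values given above.

```python
def loading_volume_ul(final_od: float, culture_ml: float = 0.5,
                      buffer_ul: float = 50.0, od_units: float = 0.1) -> float:
    """Volume of boiled sample (in microlitres) that contains od_units of cells.

    One OD unit is taken to be the cells in 1 ml of culture at OD600 = 1
    (assumption of this sketch).
    """
    if final_od <= 0:
        raise ValueError("final_od must be positive")
    pellet_units = final_od * culture_ml       # OD units resuspended in the buffer
    return buffer_ul * od_units / pellet_units


# A culture that finished at OD600 = 2.0 gives 1 OD unit in 50 microlitres
print(loading_volume_ul(2.0))
```

Under this reading the volume to load is simply 10 μl divided by the final OD600.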

3. Purification of the recombinant proteins

Single colonies are inoculated in 25 ml LB 100 μg/ml Ampicillin and grown at 37 °C overnight. The
overnight culture is inoculated in 500 ml LB/Amp and grown under shaking at 25 °C until OD600
0.4-0.7. Protein expression is then induced by adding 1 mM IPTG to the cultures. After 3.5 hours
incubation at 25 °C the final OD600 is checked and the cultures are cooled on ice. After centrifugation
at 6000 rpm (JA10 rotor, Beckman), the cell pellet is processed for purification or frozen at -20 °C.

(a) Procedure for the purification of soluble His-tagged proteins from E.coli

(1) Transfer the pellets from -20°C to ice bath and reconstitute with 10 ml 50 mM NaHPO4 buffer,
300 mM NaCl, pH 8.0, pass in 40-50 ml centrifugation tubes and break the cells as per the following
outline.

(2) Break the pellets in the French Press performing three passages with in-line washing.

(3) Centrifuge at about 30-40000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.)

(4) Equilibrate the Poly-Prep columns with 1 ml Fast Flow Chelating Sepharose resin with 50 mM
phosphate buffer, 300 mM NaCl, pH 8.0.

(5) Store the centrifugation pellet at -20°C, and load the supernatant in the columns.

(6) Collect the flow through.

(7) Wash the columns with 10 ml (2 ml + 2 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH
8.0.

(8) Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

(9) Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM imidazole
buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect the 3 corresponding fractions of ~1.5 ml
each. Add to each tube 15 μl DTT 200 mM (final concentration 2 mM).

(10) Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 μg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample be
too diluted, load 21 μl + 7 μl loading buffer).

(11) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(12) For immunisation prepare 4-5 aliquots of 100 μg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until immunisation.

(b) Purification of His-tagged proteins from inclusion bodies

Purifications are carried out essentially according to the following protocol:

(1) Bacteria are collected from 500 ml cultures by centrifugation. If required, store bacterial pellets at
-20°C. For extraction, resuspend each bacterial pellet in 10 ml 50 mM TRIS-HCl buffer, pH 8.5 on
an ice bath.

(2) Disrupt the resuspended bacteria with a French Press, performing two passages. 

(3) Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

(4) Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP (Tris(2-carboxyethyl)phosphine
hydrochloride, Pierce), 6M guanidinium chloride, pH 8.5. Stir for ~10 min. with a magnetic
bar.

(5) Centrifuge as described above, and collect the supernatant. 

(6) Prepare an adequate number of Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow
Chelating Sepharose (Pharmacia) saturated with nickel according to manufacturer recommendations.
Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM TRIS-HCl, 1 mM TCEP, 6M
guanidinium chloride, pH 8.5.

(7) Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

(8) Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8.5. Collect and set aside the first 5 ml for possible further controls.

(9) Elute the proteins bound to the columns with 4.5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots, and
collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2 mM).

(10) Measure eluted protein concentration with the Bradford method, and analyse aliquots of ca. 10
µg of protein by SDS-PAGE.

(11) Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2
mM DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8.5.
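The centrifugation steps above quote both a relative centrifugal force (x g) and rotor speeds (rpm). The two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, with r the rotor radius in cm. The sketch below is a minimal illustration; the radius used in the example is an assumed placeholder and should be taken from the rotor manual, since the value differs between the maximum and average radius of a given rotor.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    # ASSUMPTION: 10.8 cm is a placeholder radius, not a value from the text.
    radius_cm = 10.8
    for rpm in (18000, 21000):
        print(rpm, "rpm ->", round(rcf_from_rpm(rpm, radius_cm)), "x g")
    print("35000 x g ->", round(rpm_from_rcf(35000, radius_cm)), "rpm")
```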

(c) Procedure for the purification of GST-fusion proteins from E.coli 

(1) Transfer the bacterial pellets from -20°C to an ice bath and suspend with 7.5 ml PBS, pH 7.4 to
which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet every 25 ml
of buffer) has been added.

(2) Transfer to 40-50 ml centrifugation tubes and sonicate according to the following procedure: 

a. Position the probe at about 0.5 cm from the bottom of the tube.

b. Block the tube with the clamp.

c. Dip the tube in an ice bath.

d. Set the sonicator as follows: Timer -> Hold, Duty Cycle -> 55, Out. Control -> 6.

e. Perform 5 cycles of 10 impulses at a time lapse of 1 minute (one cycle = 10 impulses + ~45"
hold, i.e. a. 10 impulses + ~45" hold; b. 10 impulses + ~45" hold; c. 10 impulses + ~45" hold;
d. 10 impulses + ~45" hold; e. 10 impulses + ~45" hold).

(3) Centrifuge at about 30,000-40,000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000 rpm,
for 15 min.

(4) Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows.

(5) Equilibrate the Poly-Prep (Bio-Rad) columns with 0.5 ml (a 1 ml suspension) of Glutathione-Sepharose
4B resin, wash with 2 ml (1 + 1) H2O and then with 10 ml (2 + 4 + 4) PBS, pH 7.4.

(6) Load the supernatants on the columns and discard the flow through. 

(7) Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7.4. 

(8) Elute the proteins bound to the columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of ~1.5
ml each.

(9) Measure the protein concentration of the first two fractions with the Bradford method, analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted load 21
µl (+ 7 µl loading buffer)).

(10) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(11) For each protein destined for immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.
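The SDS-PAGE loading rule used in the purification procedures (a 10 µg aliquot, or 21 µl of sample plus 7 µl of loading buffer when the sample is too dilute) can be sketched as follows. This is an illustrative helper only: the function name is hypothetical, and the 3:1 sample-to-loading-buffer ratio applied to concentrated samples is an assumption inferred from the 21 µl + 7 µl figure, not something the procedure states.

```python
MAX_SAMPLE_UL = 21.0         # largest sample volume loaded, from the N.B. note
SAMPLE_TO_BUFFER_RATIO = 3.0  # ASSUMPTION: inferred from 21 ul + 7 ul


def sds_page_load(conc_ug_per_ul, target_ug=10.0):
    """Return (sample_ul, loading_buffer_ul, protein_loaded_ug).

    If the volume needed for target_ug exceeds MAX_SAMPLE_UL, the sample
    is treated as too dilute and the maximum volume is loaded instead.
    """
    if conc_ug_per_ul <= 0:
        raise ValueError("concentration must be positive")
    sample_ul = min(target_ug / conc_ug_per_ul, MAX_SAMPLE_UL)
    loading_buffer_ul = sample_ul / SAMPLE_TO_BUFFER_RATIO
    return sample_ul, loading_buffer_ul, sample_ul * conc_ug_per_ul


if __name__ == "__main__":
    print(sds_page_load(1.0))  # concentrated fraction (hypothetical value)
    print(sds_page_load(0.2))  # dilute fraction (hypothetical value)
```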

4. Murine Model of Protection from GAS Infection

(a) Immunization protocol

Groups of 10 CD1 female mice aged between 6 and 7 weeks are immunized with two or more GAS
antigens of the invention (20 µg of each recombinant GAS antigen), suspended in 100 µl of suitable
solution. Each group receives 3 doses at days 0, 21 and 45. Immunization is performed through
intraperitoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA) for the
first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each immunization
scheme negative and positive control groups are used.

For the negative control group, mice are immunized with E. coli proteins eluted from the purification
columns following processing of total bacterial extract from an E. coli strain containing either the
pET21b or the pGEX-NNH vector (thus expressing GST only) without any cloned GAS ORF (groups
can be indicated as HisStop or GSTStop respectively).

For the positive control groups, mice are immunized with purified GAS M cloned from either GAS
SF370 or GAS DSM 2071 strains (groups indicated as 192SF and 192DSM respectively).

Pooled sera from each group are collected before the first immunization and two weeks after the last
one. Mice are infected with GAS about a week after.

Immunized mice are infected using a GAS strain different from that used for the cloning of the 
selected proteins. For example, the GAS strain can be DSM 2071 M23 type, obtainable from the 
German Collection of Microorganisms and Cell Cultures (DSMZ). 

For infection experiments, DSM 2071 is grown at 37°C in THY broth until OD600 0.4. Bacteria are
pelleted by centrifugation, washed once with PBS, suspended and diluted with PBS to obtain the
appropriate concentration of bacteria/ml and administered to mice by intraperitoneal injection.
Between 50 and 100 bacteria are given to each mouse, as determined by plating aliquots of the
bacterial suspension on 5 THY plates. Animals are observed daily and checked for survival.
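The challenge dose is verified by plate counts. A minimal Python sketch of that check is given below. It is illustrative only: the function names, the plated volume and the injected volume are hypothetical, since the text gives only the target of 50 to 100 bacteria per mouse and the use of 5 THY plates.

```python
def cfu_per_ml(colony_counts, plated_volume_ml, dilution_factor=1.0):
    """Estimate CFU/ml of the suspension from replicate plate counts."""
    if not colony_counts:
        raise ValueError("at least one plate count is required")
    mean_count = sum(colony_counts) / len(colony_counts)
    return mean_count * dilution_factor / plated_volume_ml


def challenge_dose(cfu_ml, injected_volume_ml, low=50.0, high=100.0):
    """Return (bacteria per mouse, whether the dose lies within low..high)."""
    dose = cfu_ml * injected_volume_ml
    return dose, low <= dose <= high


if __name__ == "__main__":
    # Hypothetical counts from 5 plates, 0.1 ml plated per plate
    counts = [70, 80, 75, 85, 65]
    titre = cfu_per_ml(counts, plated_volume_ml=0.1)
    # Hypothetical injection volume of 0.1 ml per mouse
    print(challenge_dose(titre, injected_volume_ml=0.1))
```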
5. Analysis of Immune Sera 

(a) Preparation of GAS total protein extracts

Total protein extracts are prepared by incubating a bacterial culture grown to OD600 0.4-0.5 in Tris
50 mM pH 6.8/mutanolysin (20 units/ml) for 2 hr at 37°C, followed by incubation for ten minutes on
ice in 0.24 N NaOH and 0.96% β-mercaptoethanol. The extracted proteins are precipitated by
addition of trichloroacetic acid, washed with ice-cold acetone and suspended in protein loading buffer.

(b) Western blot analysis

Aliquots of total protein extract, mixed with SDS loading buffer (1x: 60 mM TRIS-HCl pH 6.8, 5%
w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100 mM DTT) and boiled for 5 minutes at 95°C,
are loaded on a 12.5% SDS-PAGE precast gel (Biorad). The gel is run using an SDS-PAGE running
buffer containing 250 mM TRIS, 2.5 mM Glycine and 0.1% SDS. The gel is electroblotted onto
nitrocellulose membrane at 200 mA for 60 minutes. The membrane is blocked for 60 minutes with
PBS/0.05% Tween-20 (Sigma), 10% skimmed milk powder and incubated O/N at 4°C with
PBS/0.05% Tween 20, 1% skimmed milk powder, with the appropriate dilution of the sera. After
washing twice with PBS/0.05% Tween, the membrane is incubated for 2 hours with peroxidase-conjugated
secondary anti-mouse antibody (Amersham) diluted 1:4000. The nitrocellulose is washed
three times for 10 minutes with PBS/0.05% Tween and once with PBS and thereafter developed by
Opti-4CN Substrate Kit (Biorad).

(c) Preparation of Paraformaldehyde treated GAS cultures 

A bacterial culture grown to OD600 0.4-0.5 is washed once with PBS and concentrated four times in
PBS/0.05% Paraformaldehyde. Following 1 hr incubation at 37°C with shaking, the treated culture
is kept overnight at 4°C and complete inactivation of bacteria is then verified by plating aliquots on
THY blood agar plates.

(d) FACS analysis of Paraformaldehyde treated GAS cultures with mouse immune sera

About 10* Paraformaldehyde inactivated bacteria are washed with 200 µl of PBS in a 96-well U-bottom
plate and centrifuged for 10 min. at 3000 g, at 4°C. The supernatant is discarded and the
bacteria are suspended in 20 µl of PBS-0.1% BSA. Eighty µl of either pre-immune or immune mouse
sera diluted in PBS-0.1% BSA are added to the bacterial suspension to a final dilution of either 1:100,
1:250 or 1:500, and incubated on ice for 30 min. Bacteria are washed once by adding 100 µl of
PBS-0.1% BSA, centrifuged for 10 min. at 3000 g, 4°C, suspended in 200 µl of PBS-0.1% BSA, centrifuged
again and suspended in 10 µl of Goat Anti-Mouse IgG, F(ab')2 fragment specific, R-Phycoerythrin-conjugated
(Jackson Immunoresearch Laboratories Inc., cat. N° 115-116-072) in PBS-0.1% BSA to a
final dilution of 1:100, and incubated on ice for 30 min. in the dark. Bacteria are washed once by
adding 180 µl of PBS-0.1% BSA and centrifuged for 10 min. at 3000 g, 4°C. The supernatant is
discarded and the bacteria are suspended in 200 µl of PBS. Bacterial suspension is passed through a
cytometric chamber of a FACS Calibur (Becton Dickinson, Mountain View, CA USA) and 10,000
events are acquired. Data are analysed using Cell Quest Software (Becton Dickinson, Mountain View,
CA USA) by drawing a morphological dot plot (using forward and side scatter parameters) on
bacterial signals. A histogram plot is then created on FL2 intensity of fluorescence log scale
recalling the morphological region of bacteria.
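Because 80 µl of diluted serum is added to 20 µl of bacterial suspension, the serum must be pre-diluted slightly less than the stated final dilution. The following Python sketch illustrates that arithmetic. It assumes that "final dilution" refers to the combined 100 µl volume, which the text implies but does not state explicitly; the function name is hypothetical.

```python
def serum_predilution(final_dilution, bacteria_ul=20.0, serum_mix_ul=80.0):
    """Dilution factor required in the serum mix before it is added.

    ASSUMPTION: the final dilution is defined over the combined volume
    (bacterial suspension + diluted serum).
    """
    total_ul = bacteria_ul + serum_mix_ul
    return final_dilution * serum_mix_ul / total_ul


def neat_serum_ul(final_dilution, bacteria_ul=20.0, serum_mix_ul=80.0):
    """Volume of undiluted serum contained in the serum mix."""
    return (bacteria_ul + serum_mix_ul) / final_dilution


if __name__ == "__main__":
    for final in (100, 250, 500):
        print("final 1:%d -> pre-dilute 1:%g, neat serum %.2f ul"
              % (final, serum_predilution(final), neat_serum_ul(final)))
```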

It will be understood that the invention has been described by way of example only and 
modifications may be made whilst remaining within the scope and spirit of the invention. 